This page is mostly for me as a handy reference for all those Python commands I tend to forget. That said, if it proves helpful to any others, all the better!
Lists
Create list by slicing items of another list:
b = [x[:8] for x in cusips_compustat]
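Here is a quick sketch of what the slice does, assuming the items are strings. The identifiers below are made-up sample values, not real data:

```python
# Hypothetical sample identifiers; the last one is shorter than 8 characters
cusips_compustat = ['037833100', '594918104', '4592001']

# x[:8] keeps the first 8 characters of each item
b = [x[:8] for x in cusips_compustat]
print(b)  # ['03783310', '59491810', '4592001']
```

Note that slicing never raises an error on short items; they simply come back unchanged.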
Combine two lists into a dictionary:
dict(zip([1, 2, 3, 4], ['a', 'b', 'c', 'd']))
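A small worked example, assuming string values (hypothetical sample data). It also shows one thing worth remembering: zip stops at the shorter input.

```python
keys = [1, 2, 3, 4]
values = ['a', 'b', 'c', 'd']  # hypothetical sample values

# zip pairs items by position; dict turns the pairs into key/value entries
combined = dict(zip(keys, values))
print(combined)  # {1: 'a', 2: 'b', 3: 'c', 4: 'd'}

# If the lists differ in length, the extra items are silently dropped
truncated = dict(zip([1, 2, 3, 4], ['a', 'b']))
print(truncated)  # {1: 'a', 2: 'b'}
```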
Get list of all files in a directory:
filenames = next(os.walk(directory_name))[2]
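For comparison, here is a sketch of an alternative built on os.listdir, which should give the same names as the os.walk one-liner for the top-level directory. The helper name list_files and the throwaway directory are my own assumptions for the demo:

```python
import os
import tempfile

def list_files(directory_name):
    """Return the names of regular files directly inside directory_name (no recursion)."""
    return sorted(
        name for name in os.listdir(directory_name)
        if os.path.isfile(os.path.join(directory_name, name))
    )

# Demo in a temporary directory holding one file and one subdirectory
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'a.txt'), 'w').close()
    os.mkdir(os.path.join(tmp, 'subdir'))
    filenames = list_files(tmp)
    walk_filenames = sorted(next(os.walk(tmp))[2])  # the one-liner, for comparison

print(filenames)       # ['a.txt']
print(walk_filenames)  # ['a.txt']
```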
Find items in one list that are not in another:
a = [...]  # some list
b = [...]  # another list
c = []
for bx in b:
    if bx not in a:
        c.append(bx)
if c:
    print("these are the list elements of b not present in a:", c)
else:
    print("every element of list b is also in list a")
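The loop above can also be written as a list comprehension. This is a minimal sketch with made-up sample lists; it assumes the items are hashable so that a set can be used for faster lookups:

```python
a = [1, 2, 3, 4]     # hypothetical sample data
b = [3, 4, 5, 6, 5]

a_set = set(a)  # set membership tests are much faster than list scans

# Keeps the order of b, and any duplicates, just like the loop version
c = [bx for bx in b if bx not in a_set]
print(c)  # [5, 6, 5]
```

If order and duplicates do not matter, set(b) - set(a) is shorter still.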
Dictionaries
Merge two dictionaries (same keys in both dictionaries):
d1 = nx.betweenness_centrality(G)
d2 = nx.degree_centrality(G)
finaldict = {key: (d1[key], d2[key]) for key in d1}
Same as above, but values as list instead of tuple:
finaldict = {key:[d1[key], d2[key]] for key in d1}
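Both versions raise a KeyError if a key from d1 is missing in d2. If the keys might differ, one possible sketch is to loop over the union of keys and use .get, which fills the gaps with None. The plain dictionaries here are hypothetical stand-ins for the networkx results:

```python
# Hypothetical scores standing in for the centrality dictionaries
d1 = {'A': 0.5, 'B': 0.1}
d2 = {'A': 0.9, 'C': 0.3}

# Union of keys from both dictionaries, sorted for a stable order
all_keys = sorted(set(d1) | set(d2))

# .get returns None when a key is absent instead of raising KeyError
finaldict = {key: (d1.get(key), d2.get(key)) for key in all_keys}
print(finaldict)  # {'A': (0.5, 0.9), 'B': (0.1, None), 'C': (None, 0.3)}
```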
For converting to a pandas DataFrame, you would normally want a dictionary of dictionaries whose inner keys become the column names:
finaldict = {key:{'degree': d2[key], 'betweenness': d1[key]} for key in d2}
To convert the above dictionary to a dataframe:
df = pd.DataFrame.from_dict(finaldict, orient='index')
And you would want this if you are building a dictionary with more than one month:
finaldict = {key: {month: {'degree': d2[key], 'betweenness': d1[key]}} for key in d2}
Converting the above nested dictionary to a DataFrame takes more work: build one frame per outer key, then concatenate them:
ticks = []
frames = []
for i, d in finaldict.items():
    ticks.append(i)
    frames.append(pd.DataFrame.from_dict(d, orient='index'))
df = pd.concat(frames, keys=ticks)
Sort dictionary by keys:
for key in sorted(mydict):
    print("%s: %s" % (key, mydict[key]))
Sort dictionary by numerical values:
for key, value in sorted(mydict.items(), key=lambda kv: (kv[1], kv[0]), reverse=True):
    print("%s: %s" % (key, value))
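To see how ties are handled, here is a small sketch with a made-up dictionary. Sorting on (value, key) with reverse=True also reverses the key order among ties; negating the value instead keeps tied keys in ascending order:

```python
mydict = {'apple': 3, 'pear': 7, 'fig': 3}  # hypothetical sample data

# Largest value first; ties broken by key in descending order
ranked = sorted(mydict.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
print(ranked)  # [('pear', 7), ('fig', 3), ('apple', 3)]

# Largest value first; ties broken by key in ascending order
ranked_asc_ties = sorted(mydict.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked_asc_ties)  # [('pear', 7), ('apple', 3), ('fig', 3)]
```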
Regular Expressions
Find all text nodes in a BeautifulSoup document that contain 'Item ':
itemx = soup(text=re.compile('Item '))
Find the first instance of text that starts with 'Item ' plus any number(s), then '.', then any number(s):
match = re.search(r'Item (\d+)\.(\d+)', data)
# match = re.search(r'Item (\d+)\.(\d+)', data, re.IGNORECASE)  # case-insensitive search
if match:
    item = match.group()
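Since re.search only returns the first match, here is a sketch of how to collect every match with finditer and findall. The sample text is made up, and the dot is escaped so that it matches a literal period only:

```python
import re

# Hypothetical sample text
data = "Item 1.01 Entry into agreement. Item 9.01 Exhibits. item 2.02 Results."

pattern = re.compile(r'Item (\d+)\.(\d+)')

# search returns only the first match
first = pattern.search(data).group()
print(first)  # Item 1.01

# finditer yields every match object; group() is the full matched text
all_items = [m.group() for m in pattern.finditer(data)]
print(all_items)  # ['Item 1.01', 'Item 9.01']

# findall returns just the captured groups as tuples
groups = pattern.findall(data)
print(groups)  # [('1', '01'), ('9', '01')]

# Case-insensitive variant also picks up the lowercase 'item'
all_items_ci = [m.group() for m in re.finditer(r'Item (\d+)\.(\d+)', data, re.IGNORECASE)]
print(all_items_ci)  # ['Item 1.01', 'Item 9.01', 'item 2.02']
```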
Miscellaneous
Loop over directory and find and open specific file:
indir = '/Users/BobSmith/Documents/SEC filings'
for root, dirs, filenames in os.walk(indir):
    for f in filenames:
        if f == '8-K.htm':
            # join with root, not indir, so files in subdirectories resolve correctly
            path = os.path.join(root, f)
            with open(path, 'r') as fh:
                data = fh.read()
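The same walk can be wrapped in a small generator so it is reusable. This is only a sketch: the helper name find_files and the temporary directory layout are assumptions for the demo, not part of the original snippet:

```python
import os
import tempfile

def find_files(indir, target_name):
    """Yield the full path of every file named target_name under indir."""
    for root, dirs, filenames in os.walk(indir):
        for f in filenames:
            if f == target_name:
                yield os.path.join(root, f)

# Demo: create a nested file in a throwaway directory, then find and read it
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, 'company', '2015')
    os.makedirs(nested)
    with open(os.path.join(nested, '8-K.htm'), 'w') as fh:
        fh.write('<html>Item 1.01</html>')

    contents = []
    for path in find_files(tmp, '8-K.htm'):
        with open(path, 'r') as fh:
            contents.append(fh.read())
    relative = [os.path.relpath(p, tmp) for p in find_files(tmp, '8-K.htm')]

print(contents)  # ['<html>Item 1.01</html>']
```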
Rename a file:
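import os
# os.path.join inserts the path separator if dir_name lacks a trailing slash
os.rename(os.path.join(dir_name, old_filename), os.path.join(dir_name, new_filename))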
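A self-contained sketch of the rename, run against a temporary directory so it is safe to try. The helper name rename_in_dir and the file names are hypothetical:

```python
import os
import tempfile

def rename_in_dir(dir_name, old_filename, new_filename):
    """Rename a file inside dir_name and return its new full path."""
    old_path = os.path.join(dir_name, old_filename)
    new_path = os.path.join(dir_name, new_filename)
    os.rename(old_path, new_path)
    return new_path

# Demo: create an empty file, rename it, and list what is left
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'old.txt'), 'w').close()
    rename_in_dir(tmp, 'old.txt', 'new.txt')
    remaining = os.listdir(tmp)

print(remaining)  # ['new.txt']
```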