I feel like I'm missing something fundamental here. I have a pandas DataFrame like this:

df = pd.DataFrame(list(range(3))).T
df.columns = ['a.first', 'a.second', 'b']

#    a.first  a.second  b
# 0        0         1  2

What I would like to create is a MultiIndex DataFrame where I can use df.a, df.a.first and df.b. What I have so far is the str.split method:

df.columns = df.columns.str.split('.', expand=True)
#        a            b
#    first  second  NaN
# 0      0       1    2

So the NaN is obviously a problem here: to access the values in b, one would need to call df.b[np.nan], which feels wrong.

From here, all the solutions that come to mind feel like workarounds, where I iterate over the columns and try to replace the NaNs with empty strings. I imagine there must be a much more straightforward way, as I guess this is a pretty common problem, no?

Edit: The least ugly solution that came to mind so far is the following:

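def apply_multiindex(df, hier_sep='.'):
    # Number of hierarchy levels in each column name
    depths = df.columns.str.split(hier_sep).map(len)
    # Number of separators each name needs to reach the maximum depth
    add_hiers = max(depths) - depths
    # Pad shallow names with trailing separators so they split into '' levels
    df.columns = [column + hier_sep * add_hiers[c]
                  for c, column in enumerate(df.columns)]
    df.columns = df.columns.str.split(hier_sep, expand=True)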

#        a          b
#    first  second  
# 0      0       1  2

I'm still looking forward to a cleaner solution :)

For me, rename on the missing value works, because fillna is not implemented for MultiIndex:

import numpy as np
import pandas as pd

df = pd.DataFrame([list(range(3))], columns=['a.first', 'a.second', 'b'])
df.columns = df.columns.str.split('.', expand=True)

df = df.rename(columns = {np.nan:''})
print (df)
      a         b
  first second   
0     0      1  2
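
If you would rather not rely on renaming NaN, the padding can also be done on plain tuples before the MultiIndex is built. This is only a minimal sketch of that idea: the helper name padded_column_tuples is made up for illustration, and it assumes the column names are strings (or convertible with str) and that '' is an acceptable filler for the missing levels.

```python
def padded_column_tuples(names, sep="."):
    """Split each name on `sep` and right-pad with '' to a common depth.

    Hypothetical helper for illustration; not part of pandas.
    """
    # One list of level labels per column name
    parts = [str(name).split(sep) for name in names]
    # Deepest hierarchy found (0 if there are no names at all)
    depth = max((len(p) for p in parts), default=0)
    # Pad the shallower names with empty strings instead of NaN
    return [tuple(p + [""] * (depth - len(p))) for p in parts]


print(padded_column_tuples(["a.first", "a.second", "b"]))
# [('a', 'first'), ('a', 'second'), ('b', '')]
```

The resulting tuples can then be handed to pandas, e.g. df.columns = pd.MultiIndex.from_tuples(padded_column_tuples(df.columns)), which should give the same column layout as the rename approach above.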
