TensorFlow: Pandas

pandas ingests data.

Primary structures are:

  • DataFrame (matrix data)
  • Series (single column)
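
To make the two structures concrete, here is a minimal sketch. The city names and population figures are hypothetical sample values picked for illustration, and the variable names are my own. It builds two Series and combines them into a DataFrame through a dict that maps column names to Series:

```python
import pandas as pd

# Hypothetical sample values, chosen only for illustration.
city_names = pd.Series(["San Francisco", "San Jose", "Sacramento"])
population = pd.Series([852469, 1015785, 485199])

# A DataFrame can be built from a dict mapping column names to Series.
cities = pd.DataFrame({"City name": city_names, "Population": population})

print(cities)
# Selecting a single column of a DataFrame gives back a Series.
print(type(cities["Population"]))
```

Each Series becomes one column, and the rows line up on the shared integer index.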

Install with pip or Homebrew: https://stackoverflow.com/questions/13249135/installing-pandas-on-mac-osx

Read in data:

from __future__ import print_function  # only needed on Python 2

import pandas as pd
print(pd.__version__)  # confirm which pandas version is installed

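# Load the California housing training set directly from its URL.
california_housing_dataframe = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv", sep=",")
# Print summary statistics for every numeric column.
print(california_housing_dataframe.describe())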

This should give (middle columns are elided in the printed output):

          longitude      latitude  housing_median_age  ...    households  median_income  median_house_value
count  17000.000000  17000.000000        17000.000000  ...  17000.000000   17000.000000        17000.000000
mean    -119.562108     35.625225           28.589353  ...    501.221941       3.883578       207300.912353
std        2.005166      2.137340           12.586937  ...    384.520841       1.908157       115983.764387
min     -124.350000     32.540000            1.000000  ...      1.000000       0.499900        14999.000000
25%     -121.790000     33.930000           18.000000  ...    282.000000       2.566375       119400.000000
50%     -118.490000     34.250000           29.000000  ...    409.000000       3.544600       180400.000000
75%     -118.000000     37.720000           37.000000  ...    605.250000       4.767000       265000.000000
max     -114.310000     41.950000           52.000000  ...   6082.000000      15.000100       500001.000000

[8 rows x 9 columns]
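
To try the same read_csv and describe() steps without a network connection, the sketch below may help. It parses a small CSV held in a string; the column names and values are hypothetical toy data, not taken from the housing file. It then reads individual statistics out of the summary by row label and column name:

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a downloaded file.
csv_text = """rooms,income
2.0,1.5
4.0,2.5
6.0,3.5
8.0,4.5
"""

# read_csv accepts a file-like object as well as a path or URL.
toy = pd.read_csv(io.StringIO(csv_text), sep=",")

# describe() returns a DataFrame with one row per statistic.
summary = toy.describe()
print(summary)

# Individual statistics are addressed by row label and column name.
print(summary.loc["mean", "rooms"])
print(summary.loc["50%", "income"])
```

Because the summary is itself a DataFrame, the rows labelled count, mean, std, min, 25%, 50%, 75% and max can be selected like any other data.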

https://developers.google.com/machine-learning/crash-course/prereqs-and-prework