AnacondaCON 2020
What's new in pandas?, Tom Augspurger
Pandas + Numba: support for Numba-jitted .rolling().apply(). Support is being expanded to .transform(), .aggregate() and .apply()
https://github.com/TomAugspurger/acon-2020-pandas/blob/master/Numba%20Acceleration.ipynb
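import numpy as np
import pandas as pd

def mad(x):
    # Mean absolute deviation of one rolling window
    return np.fabs(x - x.mean()).mean()

df = pd.DataFrame(
    {"A": np.random.randn(100_000)},
    index=pd.date_range("1/1/2000", periods=100_000, freq="min"),  # "min" replaces the deprecated "T" alias
).cumsum()

df.rolling(10).apply(mad)  # Traditional: one Python call per window
df.rolling(10).apply(mad, engine="numba", raw=True)  # Numba: requires raw=True and numba installed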
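The numba engine needs raw=True because the jitted function receives each window as a plain NumPy array rather than a Series. A minimal sketch of that difference, using only pandas and NumPy (no numba required); the tiny frame and the name mad_raw are made up for illustration:

```python
import numpy as np
import pandas as pd


def mad_raw(x):
    # Works for both a Series (raw=False) and an ndarray (raw=True),
    # since both provide .mean() and support subtraction.
    return np.fabs(x - x.mean()).mean()


# Hypothetical small frame so the result is easy to check by hand.
small = pd.DataFrame({"A": [1.0, 2.0, 4.0, 7.0, 11.0]})

# raw=False (the default) passes a Series per window.
via_series = small.rolling(3).apply(mad_raw)
# raw=True passes a numpy.ndarray per window, which is what engine="numba" expects.
via_ndarray = small.rolling(3).apply(mad_raw, raw=True)

# The first two rows are NaN because the 3-element window is not yet full.
print(via_ndarray)
```

Both calls should give the same numbers; raw=True is usually faster even without numba because no Series is built per window.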
Added nullable integer, boolean and string types
https://github.com/TomAugspurger/acon-2020-pandas/blob/master/Extension%20Arrays.ipynb
dtype="string"
df.select_dtypes(include="string") - can be good for transformations on columns
dtype="Int64"
dtype="boolean"
Can opt in today using
df.convert_dtypes()
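pd.read_csv(..., use_nullable_dtypes=True) - keyword as presented in the talk; pandas 2.0 and later spell this dtype_backend="numpy_nullable"
pd.options.use_nullable_dtypes - to opt in globally (option name as presented in the talk)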
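A short sketch of the nullable dtypes and of convert_dtypes(), assuming a pandas version that has them (1.0 or later). The column names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical frame built with the three nullable extension dtypes.
df = pd.DataFrame({
    "name": pd.array(["a", "b", None], dtype="string"),
    "count": pd.array([1, 2, None], dtype="Int64"),
    "flag": pd.array([True, None, False], dtype="boolean"),
})
dtype_names = {col: str(dt) for col, dt in df.dtypes.items()}

# Pick out only the string columns, e.g. to apply .str transformations to them.
string_cols = list(df.select_dtypes(include="string").columns)

# Default inference turns an integer column with a missing value into float64;
# convert_dtypes() moves it to the nullable Int64 dtype instead.
inferred = pd.DataFrame({"count": [1, 2, None]})
converted = inferred.convert_dtypes()

print(dtype_names)
print(string_cols)
print(inferred.dtypes["count"], "->", converted.dtypes["count"])
```

The point of the Int64 column is that the integers stay integers even though one value is missing.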
pd.NA scalar to indicate missing data.
<NA> means the value is unknown
Planned: add a FloatArray; pd.NA support may also be added to TimestampArray
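Because pd.NA means "unknown", comparisons and boolean operations propagate it unless the answer is already determined by the other operand. A small sketch of that behaviour, assuming pandas 1.0 or later; the variable names are illustrative only:

```python
import pandas as pd

# A comparison with an unknown value is itself unknown.
unknown = pd.NA == 1

# Three-valued logic: the result is known only when the other operand decides it.
na_or_true = pd.NA | True      # True, whatever the unknown value is
na_and_false = pd.NA & False   # False, whatever the unknown value is
na_and_true = pd.NA & True     # still unknown

# Nullable integer array: reductions skip missing values by default,
# while element-wise comparisons keep the missing slot as <NA>.
values = pd.array([1, None, 3], dtype="Int64")
total = values.sum()
mask = values > 1

print(unknown, na_or_true, na_and_false, na_and_true)
print(total, mask)
```

This differs from np.nan, where a comparison such as np.nan == 1 returns False rather than staying unknown.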
Python Data Visualization - Huge leaps forward!, James Bednar
http://holoviz.org/ - Tools maintained by Anaconda
https://pyviz.org/ - Python viz
Since last year:
Widget library unification (Bokeh/Panel/ipywidgets)
Annotators (Plotly/Bokeh/HoloViews) - collect user input on your plot
Dashboards:
Dash
Panel
Voila
Streamlit
Dash - Plotly, 2016. Needs quite a bit of HTML/CSS knowledge
jupyter-dash now allows incremental dashboard building
https://dash-gallery.plotly.host/dash-oil-and-gas/
Panel - Anaconda - Zero cost to switch from interactive prototype to deployed app and back.
Easy static HTML/CSS output with live widgets.
Suits a small number of users working with large data. Can output HTML/CSS
https://glaciers.pyviz.demo.anaconda.com/glaciers (https://github.com/pyviz-demos/glaciers )
Voila - jupyter notebook into a shareable dashboard
deploys a full Jupyter kernel per user (not scalable)
to build complex layouts, use ipywidgets or templates
https://voila-gpx-viewer.pyviz.demo.anaconda.com/
Streamlit - python script into dashboard
re-runs entire script on any interaction
https://ipyvolume.readthedocs.io/en/latest/
Datashader and Vaex render data server side
https://datashader.org/ - handle big data.
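The idea behind server-side rendering is that the server reduces the points to a fixed-size grid of counts, so the browser receives an image whose size does not depend on the number of points. A rough NumPy sketch of that idea follows. It is not Datashader's API; the point count, canvas size and log scaling are assumptions chosen for illustration:

```python
import numpy as np

# Hypothetical data set: one million random points.
rng = np.random.default_rng(0)
n_points = 1_000_000
x = rng.normal(size=n_points)
y = rng.normal(size=n_points)

# Bin every point into a fixed-size canvas (width x height pixels).
width, height = 300, 200
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=(width, height), range=[[-4, 4], [-4, 4]]
)

# Log-scale the counts into 8-bit intensities so sparse regions stay visible.
# Transpose so rows correspond to y and columns to x, as in an image.
scaled = np.log1p(counts) / np.log1p(counts.max()) * 255
image = scaled.astype(np.uint8).T

# Only this small array needs to reach the client, not the raw points.
print(image.shape, image.dtype, image.nbytes, "bytes")
```

Whatever the number of input points, the payload stays at width x height pixels; zooming means re-running the aggregation on the server for the new range.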
https://github.com/bqplot/bqplot
https://pyviz.org/tools.html#high-level-shared-api
HoloViews offers .df and .xr accessors to call pandas or xarray methods directly
https://examples.pyviz.org/ml_annotators/ - to draw points on a map and get data
https://github.com/makepath/xarray-spatial
https://github.com/holoviz/spatialpandas