AnacondaCON 2020

What's new in pandas?, Tom Augspurger

pandas + Numba: support for Numba-jitted .rolling().apply(). Support is being extended to .transform(), .aggregate() and .apply().

https://github.com/TomAugspurger/acon-2020-pandas/blob/master/Numba%20Acceleration.ipynb

import numpy as np
import pandas as pd

def mad(x):
    return np.fabs(x - x.mean()).mean()

df = pd.DataFrame({"A": np.random.randn(100_000)},
                  index=pd.date_range('1/1/2000', periods=100_000,
                                      freq='T')).cumsum()

df.rolling(10).apply(mad)  # Traditional

df.rolling(10).apply(mad, engine="numba", raw=True)  # numba
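Note on raw=True: the numba engine needs it, because the function then receives plain NumPy arrays instead of Series. A minimal sketch (my own, not from the talk notebook) checking that raw=True alone gives the same numbers as the default path. It uses the default engine so it runs without numba installed; the smaller frame and the names small, as_series and as_arrays are made up for illustration.

```python
import numpy as np
import pandas as pd


def mad(x):
    # Mean absolute deviation of one window; works on a Series or an ndarray
    return np.fabs(x - x.mean()).mean()


# Smaller frame than in the talk so the pure-Python path finishes quickly
rng = np.random.default_rng(0)
small = pd.DataFrame({"A": rng.standard_normal(1_000)}).cumsum()

# raw=False (default): each window is passed to mad as a pandas Series
as_series = small.rolling(10).apply(mad, raw=False)

# raw=True: each window is passed as a NumPy array (what engine="numba" expects)
as_arrays = small.rolling(10).apply(mad, raw=True)

# Both paths should agree, including the leading NaNs from incomplete windows
same = np.allclose(as_series["A"], as_arrays["A"], equal_nan=True)
print("raw=True matches raw=False:", same)
```

If numba is installed, adding engine="numba" to the raw=True call should give the same values again, with the first call paying the compilation cost.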


Added nullable integer, boolean and string types

https://github.com/TomAugspurger/acon-2020-pandas/blob/master/Extension%20Arrays.ipynb

dtype="string"

df.select_dtypes(include="string") - can be good for transformations on columns

dtype="Int64"

dtype="boolean"

Can opt in today using

df.convert_dtypes()

pd.read_csv(..., use_nullable_dtypes=True)

pd.options.use_nullable_dtypes - to opt in globally

pd.NA scalar to indicate missing data.

<NA> means the value is unknown.

Next: add a nullable FloatArray; may also extend pd.NA to TimestampArray.
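A small sketch of the nullable dtypes and pd.NA from this section, written by me rather than taken from the talk notebook. The column names and values are made up; the dtype strings, select_dtypes, convert_dtypes and pd.NA are the pieces named above.

```python
import pandas as pd

# Build columns with the nullable extension dtypes explicitly
names = pd.Series(["a", None, "c"], dtype="string")
counts = pd.Series([1, None, 3], dtype="Int64")
flags = pd.Series([True, None, False], dtype="boolean")
df = pd.DataFrame({"name": names, "count": counts, "flag": flags})

# Missing values are pd.NA and the integer column stays integer (no float upcast)
print(df.dtypes)
print(counts[1] is pd.NA)

# Pick out only the string columns, e.g. to apply .str transformations
string_cols = df.select_dtypes(include="string").columns.tolist()

# pd.NA means "unknown": comparisons propagate it, and logical ops
# only resolve when the answer does not depend on the unknown value
unknown_eq = pd.NA == 1
na_or_true = pd.NA | True
na_and_false = pd.NA & False

# Opting in from ordinary NumPy-backed data
converted = pd.DataFrame({"count": [1.0, None, 3.0]}).convert_dtypes()
print(converted.dtypes)
```

Reductions such as counts.sum() skip pd.NA by default, as they do with NaN.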


Python Data Visualization - Huge leaps forward!, James Bednar

http://holoviz.org/ - Tools maintained by Anaconda

https://pyviz.org/ - Python viz

Since last year:

  • Widget library unification (Bokeh/Panel/ipywidgets)

  • Annotators (Plotly/Bokeh/HoloViews) - collect user input on your plot

Dashboards:

  • Dash

  • Panel

  • Voila

  • Streamlit

Dash (Plotly, 2016): needs quite a bit of HTML/CSS knowledge.

jupyter-dash now allows incremental dashboard building

https://dash-gallery.plotly.host/dash-oil-and-gas/

Panel (Anaconda): zero cost to switch from interactive prototype to deployed app and back.

Easy static HTML/CSS output with live widgets.

Targets a small number of users with large data.

https://glaciers.pyviz.demo.anaconda.com/glaciers (https://github.com/pyviz-demos/glaciers )

Voila: turns a Jupyter notebook into a shareable dashboard.

Deploys a full Jupyter kernel (not scalable).

To build complex layouts, use ipywidgets or templates.

https://voila-gpx-viewer.pyviz.demo.anaconda.com/

Streamlit: turns a Python script into a dashboard.

Re-runs the entire script on any interaction.


https://ipyvolume.readthedocs.io/en/latest/

Datashader and Vaex render data server side

https://datashader.org/ - handle big data.
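Rough idea behind server-side rendering, as I understand it: instead of shipping every point to the browser, the server bins the points into a fixed-size grid and sends only the resulting image. The sketch below uses plain NumPy to illustrate that idea; it is not Datashader's API, and the point count, grid size and variable names are made up.

```python
import numpy as np

# One million random points standing in for a large dataset
rng = np.random.default_rng(42)
n_points = 1_000_000
x = rng.standard_normal(n_points)
y = rng.standard_normal(n_points)

# Aggregate on the "server": count points per cell of a fixed-size grid
width, height = 400, 300
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(width, height))

# Compress the dynamic range and scale to 8-bit grey levels for display
image = np.log1p(counts)
image = (255 * image / image.max()).astype(np.uint8)

# Only width * height values go to the client, however many points there are
print(image.shape, image.size, "cells for", n_points, "points")
```

The payload size depends on the grid resolution, not on the number of points, which is why this approach keeps working as the data grows.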


https://github.com/bqplot/bqplot

https://pyviz.org/tools.html#high-level-shared-api

HoloViews offers .df and .xr accessors to call pandas or xarray methods directly.


https://examples.pyviz.org/ml_annotators/ - to draw points on a map and get data


https://github.com/makepath/xarray-spatial

https://github.com/holoviz/spatialpandas

https://github.com/rapidsai/cuspatial

https://awesome-panel.org/