Open Access, Proceedings Article

Data Structures for Statistical Computing in Python

Wes McKinney
- pp 56-61
TLDR
pandas is a new library that aims to facilitate working with data sets common to finance, statistics, and other related fields and to provide a set of fundamental building blocks for implementing statistical models.
Abstract
In this paper we are concerned with the practical issues of working with data sets common to finance, statistics, and other related fields. pandas is a new library which aims to facilitate working with these data sets and to provide a set of fundamental building blocks for implementing statistical models. We will discuss specific design issues encountered in the course of developing pandas with relevant examples and some comparisons with the R language. We conclude by discussing possible future directions for statistical computing and data analysis using Python.


Citations
Posted Content

Creating Artificial Human Genomes Using Generative Models

TL;DR: Deep generative adversarial networks and restricted Boltzmann machines are trained to learn the high dimensional distributions of real genomic datasets and create artificial genomes (AGs), which have the potential to become valuable assets in genetic studies by providing high quality anonymous substitutes for private databases.
Proceedings Article

On Feature Learning in the Presence of Spurious Correlations

TL;DR: This paper evaluates the amount of information about the core (non-spurious) features that can be decoded from the representations learned by standard empirical risk minimization (ERM) and specialized group robustness training and finds that strong regularization is not necessary for learning high quality feature representations.
Journal Article

pyrolite: Python for geochemistry

TL;DR: pyrolite provides tools for processing, transforming and visualising geochemical data from common tabular formats that provide a foundation for preparing data for subsequent machine learning applications using scikit-learn.
Journal Article

hynet: An Optimal Power Flow Framework for Hybrid AC/DC Power Systems

TL;DR: hynet is a Python-based open-source optimal power flow (OPF) framework for hybrid AC/DC grids with point-to-point and radial multi-terminal HVDC systems.
Journal Article

Inverse Problems in Asteroseismology

TL;DR: New techniques to measure the ages, masses, and radii of stars are presented, as well as a way to infer their internal structure.