Pandas-like dataframe & series.
§Series
§1. Declare Series
- To declare a series, you need a `Vec<T>` where `T` is one of the following types.
Primitive type | DType |
---|---|
usize | USIZE |
u8 | U8 |
u16 | U16 |
u32 | U32 |
u64 | U64 |
isize | ISIZE |
i8 | I8 |
i16 | I16 |
i32 | I32 |
i64 | I64 |
f32 | F32 |
f64 | F64 |
bool | Bool |
char | Char |
String | Str |
- Once you have a `Vec<T>`, create the series with `Series::new(Vec<T>)`.
§2. Methods for Series
- `TypedVector<T>` trait for `Series`

pub trait TypedVector<T> {
    fn new(v: Vec<T>) -> Self;
    fn to_vec(&self) -> Vec<T>;
    fn as_slice(&self) -> &[T];
    fn as_slice_mut(&mut self) -> &mut [T];
    fn at_raw(&self, i: usize) -> T;
    fn push(&mut self, elem: T);
}

- `Series` methods

impl Series {
    pub fn at(&self, i: usize) -> Scalar;
    pub fn len(&self) -> usize;
    pub fn to_type(&self, dtype: DType) -> Series;
    pub fn as_type(&mut self, dtype: DType);
}

  - `at` is a simple getter for `Series`. It returns a `Scalar`.
  - `as_type` casts the series to another type in place.
    - All types can be changed to `Str`.
    - All integer and float types can be converted to one another.
    - `Bool` and `Char` can be changed to `Str` or `U8` only.
    - `U8` can be changed to all types.
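
The casting rules above can be read as a small decision table. The following std-only sketch encodes them in a hypothetical `can_cast` helper over a stand-in `DType` enum; neither is part of the crate's API. Two assumptions are made here: casting a type to itself is allowed, and any conversion the list does not mention (for example, from `Str`) is treated as unsupported.

```rust
/// Stand-in for the crate's `DType` enum, using the variant names from the table above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DType {
    USIZE, U8, U16, U32, U64,
    ISIZE, I8, I16, I32, I64,
    F32, F64,
    Bool, Char, Str,
}

impl DType {
    /// True for every integer and float variant.
    fn is_numeric(self) -> bool {
        !matches!(self, DType::Bool | DType::Char | DType::Str)
    }
}

/// Hypothetical helper: do the documented rules allow `from -> to`?
fn can_cast(from: DType, to: DType) -> bool {
    use DType::*;
    if from == to || to == Str {
        // Identity (assumed) and "all types can be changed to Str".
        return true;
    }
    match from {
        U8 => true,              // U8 can be changed to all types
        Bool | Char => to == U8, // only Str (handled above) or U8
        Str => false,            // not covered by the list; assumed unsupported
        _ => to.is_numeric(),    // integers and floats are interchangeable
    }
}

fn main() {
    assert!(can_cast(DType::Bool, DType::U8));
    assert!(can_cast(DType::U8, DType::I32));
    assert!(can_cast(DType::I32, DType::F64));
    // Bool cannot go to I32 directly under these rules.
    assert!(!can_cast(DType::Bool, DType::I32));
}
```

Under these rules a `Bool` or `Char` series can still reach any other type in two steps, by casting to `U8` first.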
§3. Example
extern crate peroxide;
use peroxide::fuga::*;

fn main() {
    let a = Series::new(vec![1, 2, 3, 4]);
    let b = Series::new(vec!['a', 'b', 'c', 'd']);
    let mut c = Series::new(vec![true, false, false, true]);

    a.print();       // print for Series
    b.dtype.print(); // print for dtype of Series (=Char)
    c.as_type(U8);   // Bool => U8

    assert_eq!(c.dtype, U8);
}
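
The example calls the same `Series::new` with vectors of integers, chars, and bools. One plausible way to get that behavior is to implement `TypedVector<T>` once per element type, so the compiler selects the impl from the vector's type. The std-only sketch below models this for three types; `MiniSeries`, `Values`, and the reduced `DType` are hypothetical names for illustration and should not be taken as the crate's actual internals.

```rust
/// Hypothetical type-erased storage: one variant per supported element type
/// (only three of the fifteen types are modelled).
#[derive(Debug, Clone, PartialEq)]
enum Values {
    I32(Vec<i32>),
    Char(Vec<char>),
    Bool(Vec<bool>),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DType {
    I32,
    Char,
    Bool,
}

#[derive(Debug, Clone, PartialEq)]
struct MiniSeries {
    values: Values,
    dtype: DType,
}

/// Cut-down version of the trait quoted earlier (only `new` and `to_vec`).
trait TypedVector<T> {
    fn new(v: Vec<T>) -> Self;
    fn to_vec(&self) -> Vec<T>;
}

// One impl per element type; a macro keeps the repetition short.
macro_rules! impl_typed_vector {
    ($t:ty, $variant:ident) => {
        impl TypedVector<$t> for MiniSeries {
            fn new(v: Vec<$t>) -> Self {
                MiniSeries { values: Values::$variant(v), dtype: DType::$variant }
            }
            fn to_vec(&self) -> Vec<$t> {
                match &self.values {
                    Values::$variant(v) => v.clone(),
                    _ => panic!("series does not hold the requested element type"),
                }
            }
        }
    };
}

impl_typed_vector!(i32, I32);
impl_typed_vector!(char, Char);
impl_typed_vector!(bool, Bool);

fn main() {
    // The element type of the vector decides which impl (and dtype tag) is used.
    let a = MiniSeries::new(vec![1, 2, 3, 4]);
    let b = MiniSeries::new(vec!['a', 'b', 'c', 'd']);
    assert_eq!(a.dtype, DType::I32);
    assert_eq!(b.dtype, DType::Char);

    // Reading back needs a type annotation to pick the matching impl.
    let raw: Vec<i32> = a.to_vec();
    assert_eq!(raw, vec![1, 2, 3, 4]);
}
```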
§DataFrame
§1. Declare DataFrame
- To declare a dataframe, use the constructor `DataFrame::new(Vec<Series>)`.
extern crate peroxide;
use peroxide::fuga::*;

fn main() {
    // 1-1. Empty DataFrame
    let mut df = DataFrame::new(vec![]);

    // 1-2. Push Series
    df.push("a", Series::new(vec![1, 2, 3, 4]));
    df.push("b", Series::new(vec![0.1, 0.2, 0.3, 0.4]));
    df.push("c", Series::new(vec!['a', 'b', 'c', 'd']));

    // 1-3. Print
    df.print();

    // 2-1. Construct Series first
    let a = Series::new(vec![1, 2, 3, 4]);
    let b = Series::new(vec![0.1, 0.2, 0.3, 0.4]);
    let c = Series::new(vec!['a', 'b', 'c', 'd']);

    // 2-2. Declare DataFrame with existing Series
    let mut dg = DataFrame::new(vec![a, b, c]);

    // 2-3. Print or set header
    dg.print();                         // Default header: 0 1 2
    dg.set_header(vec!["a", "b", "c"]); // Change header
}
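
The two construction routes differ only in where the column names come from: `push` supplies a name with each column, while `DataFrame::new` falls back to positional headers (`0 1 2`) until `set_header` is called. The std-only sketch below models that behavior with a hypothetical `MiniFrame` whose columns are plain `Vec<f64>`. The length check in `set_header` is an assumption of this sketch, not documented behavior of the crate.

```rust
/// Hypothetical stand-in for a column; the real `Series` is type-erased.
type Column = Vec<f64>;

#[derive(Debug, Default)]
struct MiniFrame {
    header: Vec<String>,
    data: Vec<Column>,
}

impl MiniFrame {
    /// Columns passed without names get their positions as headers: "0", "1", ...
    fn new(data: Vec<Column>) -> Self {
        let header = (0..data.len()).map(|i| i.to_string()).collect();
        MiniFrame { header, data }
    }

    /// Append a named column.
    fn push(&mut self, name: &str, col: Column) {
        self.header.push(name.to_string());
        self.data.push(col);
    }

    /// Replace all column names (assumed here: exactly one name per column).
    fn set_header(&mut self, new_header: Vec<&str>) {
        assert_eq!(new_header.len(), self.data.len(), "one name per column");
        self.header = new_header.into_iter().map(String::from).collect();
    }
}

fn main() {
    // Route 1: start empty and push named columns.
    let mut df = MiniFrame::new(vec![]);
    df.push("a", vec![1.0, 2.0]);
    df.push("b", vec![0.1, 0.2]);
    assert_eq!(df.header, vec!["a", "b"]);

    // Route 2: build from existing columns, then rename.
    let mut dg = MiniFrame::new(vec![vec![1.0, 2.0], vec![0.1, 0.2]]);
    assert_eq!(dg.header, vec!["0", "1"]);
    dg.set_header(vec!["a", "b"]);
    assert_eq!(dg.header, df.header);
}
```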
§2. Methods for DataFrame
- `DataFrame` methods

impl DataFrame {
    pub fn new(v: Vec<Series>) -> Self;
    pub fn header(&self) -> &Vec<String>;
    pub fn header_mut(&mut self) -> &mut Vec<String>;
    pub fn set_header(&mut self, new_header: Vec<&str>);
    pub fn push(&mut self, name: &str, series: Series);
    pub fn drop(&mut self, col_header: &str);
    pub fn row(&self, i: usize) -> DataFrame;
    pub fn spread(&self) -> String;
    pub fn as_types(&mut self, dtypes: Vec<DType>);
}

  - `push(&mut self, name: &str, series: Series)`: push a header and `Series` pair
  - `drop(&mut self, col_header: &str)`: drop a specific column by header
  - `row(&self, i: usize) -> DataFrame`: extract the $i$-th row as a new `DataFrame`
- `WithCSV` trait

pub trait WithCSV: Sized {
    fn write_csv(&self, file_path: &str) -> Result<(), Box<dyn Error>>;
    fn read_csv(file_path: &str, delimiter: char) -> Result<Self, Box<dyn Error>>;
}

  - The `csv` feature is required.

// Example for CSV
#[macro_use]
extern crate peroxide;
use peroxide::fuga::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Write CSV
    let mut df = DataFrame::new(vec![]);
    df.push("a", Series::new(vec!['x', 'y', 'z']));
    df.push("b", Series::new(vec![0, 1, 2]));
    df.push("c", Series::new(c!(0.1, 0.2, 0.3)));
    df.write_csv("example_data/doc_csv.csv")?;

    // Read CSV
    let mut dg = DataFrame::read_csv("example_data/doc_csv.csv", ',')?;
    dg.as_types(vec![Char, I32, F64]);

    assert_eq!(df, dg);
    Ok(())
}
- `WithNetCDF` trait

pub trait WithNetCDF: Sized {
    fn write_nc(&self, file_path: &str) -> Result<(), Box<dyn Error>>;
    fn read_nc(file_path: &str) -> Result<Self, Box<dyn Error>>;
    fn read_nc_by_header(file_path: &str, header: Vec<&str>) -> Result<Self, Box<dyn Error>>;
}

  - The `nc` feature is required.
  - The `libnetcdf` dependency is required.
  - `Char` and `Bool` are saved as `U8`. Thus, to read a `Char` or `Bool` column from an nc file, explicit type casting is required.

#[macro_use]
extern crate peroxide;
use peroxide::fuga::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Write netcdf
    let mut df = DataFrame::new(vec![]);
    df.push("a", Series::new(vec!['x', 'y', 'z']));
    df.push("b", Series::new(vec![0, 1, 2]));
    df.push("c", Series::new(c!(0.1, 0.2, 0.3)));
    df.write_nc("example_data/doc_nc.nc")?;

    // Read netcdf
    let mut dg = DataFrame::read_nc("example_data/doc_nc.nc")?;
    dg["a"].as_type(Char); // Char, Bool are only read/written as U8 type

    assert_eq!(df, dg);
    Ok(())
}
- `WithParquet` trait

pub trait WithParquet: Sized {
    fn write_parquet(&self, file_path: &str, compression: CompressionOptions) -> Result<(), Box<dyn Error>>;
    fn read_parquet(file_path: &str) -> Result<Self, Box<dyn Error>>;
}

  - The `parquet` feature is required.
  - `Char` is saved as `String`. Thus, reading a `Char` column from a parquet file yields `String`.
  - Caution: for `Bool` columns of different lengths, missing values are filled with `false`.

#[macro_use]
extern crate peroxide;
use peroxide::fuga::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Write parquet
    let mut df = DataFrame::new(vec![]);
    df.push("a", Series::new(vec!['x', 'y', 'z']));
    df.push("b", Series::new(vec![0, 1, 2]));
    df.push("c", Series::new(c!(0.1, 0.2, 0.3)));
    df.write_parquet("example_data/doc_pq.parquet", CompressionOptions::Uncompressed)?;

    // Read parquet
    let mut dg = DataFrame::read_parquet("example_data/doc_pq.parquet")?;
    dg["a"].as_type(Char); // Char is only read/written as String type

    assert_eq!(df, dg);
    Ok(())
}
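
The NetCDF note above says `Char` and `Bool` columns are stored as `U8`, which is why the read-back example needs an explicit `as_type(Char)`. The std-only sketch below shows what such a round trip looks like at the element level. The helper names are hypothetical, and the exact encoding the crate uses is not documented here, so the one-byte-per-char mapping and the "non-zero means true" rule are assumptions of this sketch.

```rust
/// Encode chars as bytes, as a U8-only column would require.
/// Assumes each char fits in one byte; returns `None` if any does not.
fn chars_to_u8(col: &[char]) -> Option<Vec<u8>> {
    col.iter()
        .map(|&c| if (c as u32) <= 0xFF { Some(c as u8) } else { None })
        .collect()
}

/// Decode bytes back to chars: the step an explicit cast to `Char` stands for.
fn u8_to_chars(raw: &[u8]) -> Vec<char> {
    raw.iter().map(|&b| b as char).collect()
}

/// Encode bools as 0 / 1.
fn bools_to_u8(col: &[bool]) -> Vec<u8> {
    col.iter().map(|&b| b as u8).collect()
}

/// Decode bytes back to bools (assumed rule: any non-zero byte is `true`).
fn u8_to_bools(raw: &[u8]) -> Vec<bool> {
    raw.iter().map(|&b| b != 0).collect()
}

fn main() {
    let original = vec!['x', 'y', 'z'];

    // What ends up in the file is a byte column...
    let stored = chars_to_u8(&original).expect("all chars fit in one byte");
    assert_eq!(stored, vec![120, 121, 122]);

    // ...so the reader sees U8 until it casts back explicitly.
    let restored = u8_to_chars(&stored);
    assert_eq!(restored, original);

    let flags = vec![true, false, true];
    assert_eq!(u8_to_bools(&bools_to_u8(&flags)), flags);
}
```

Without the cast back, a comparison such as `assert_eq!(df, dg)` would be comparing a `Char` column with a `U8` column.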
Structs§
- Generic `DataFrame` structure
- Generic Scalar
- Generic Series
Enums§
- Data Type enum
- Vector with `DType`
- Scalar with `DType`
Traits§
- To handle CSV file format
- To handle NetCDF file format
- To handle parquet format