tslumen.cli package

CLI for tslumen.

class tslumen.cli.CLIConfig(defaults: List[Any] = <factory>, hydra: Any = <factory>, input: str = '???', output: Optional[str] = None, reader: tslumen.cli.readers.Reader = '???', profiler: Any = <factory>, scheduler: tslumen.scheduling.Scheduler.Config = Scheduler.Config(n_jobs=-2, prefer='processes', verbose=0, timeout=None, backend=None, pre_dispatch='2 * n_jobs', batch_size='auto', temp_folder=None, max_nbytes='1M', mmap_mode='r', require=None, progress_disable=False))

Bases: object

Configurations for the CLI.

defaults: List[Any]
hydra: Any
input: str = '???'
output: Optional[str] = None
profiler: Any
reader: tslumen.cli.readers.Reader = '???'
scheduler: tslumen.scheduling.Scheduler.Config = Scheduler.Config(n_jobs=-2, prefer='processes', verbose=0, timeout=None, backend=None, pre_dispatch='2 * n_jobs', batch_size='auto', temp_folder=None, max_nbytes='1M', mmap_mode='r', require=None, progress_disable=False)
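
The '???' defaults on input and reader are the same string OmegaConf uses to mark a mandatory value with no default (omegaconf.MISSING), which suggests both must be supplied when the CLI is invoked. The following is a minimal, standard-library-only sketch of that convention, not tslumen's own code; MiniConfig and missing_fields are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

MISSING = "???"  # same string OmegaConf uses to mark a mandatory value


@dataclass
class MiniConfig:
    """Hypothetical, reduced stand-in for CLIConfig (illustration only)."""

    input: str = MISSING
    output: Optional[str] = None
    reader: str = MISSING


def missing_fields(cfg: MiniConfig) -> List[str]:
    """Return the names of mandatory fields that are still unset."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) == MISSING]


# Nothing supplied: both mandatory fields are reported.
print(missing_fields(MiniConfig()))
# Both supplied: nothing is reported; output stays optional.
print(missing_fields(MiniConfig(input="data.csv", reader="csv")))
```

In a real Hydra or OmegaConf setup the check is done by the library itself when a missing value is accessed; the helper above only mimics the idea.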
tslumen.cli.main(cfg: tslumen.cli.CLIConfig) → None

CLI entry point. Takes the full configuration, instantiates a scheduler, reads the input data into a DataFrame using the supplied Reader, profiles the data with the DefaultProfiler, passes the results to HtmlReport, and writes the resulting HTML to the output file or stream.

Parameters

cfg (CLIConfig) – CLI configurations.
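
The read, profile, render, write sequence described for main can be sketched as follows. This is an illustration under stated assumptions, not the tslumen implementation: run_pipeline and the toy_* functions are hypothetical stand-ins for the Reader, DefaultProfiler and HtmlReport, the scheduler step is left out, and treating output=None as "write to a stream" is an assumption based on the "output file or stream" wording.

```python
import io
import sys
from typing import Callable, Dict, List, Optional, TextIO

Rows = List[Dict[str, float]]


def run_pipeline(
    read: Callable[[], Rows],
    profile: Callable[[Rows], Dict[str, float]],
    render: Callable[[Dict[str, float]], str],
    output: Optional[str] = None,
    stream: Optional[TextIO] = None,
) -> None:
    """Read the data, profile it, render HTML, then write it out."""
    data = read()
    result = profile(data)
    html = render(result)
    if output is None:
        # Assumption: no output path means "write to a stream".
        (stream if stream is not None else sys.stdout).write(html)
    else:
        with open(output, "w", encoding="utf-8") as fh:
            fh.write(html)


def toy_read() -> Rows:
    """Stand-in for Reader.read(); returns plain rows, not a DataFrame."""
    return [{"value": 1.0}, {"value": 3.0}]


def toy_profile(rows: Rows) -> Dict[str, float]:
    """Stand-in for the profiler; computes two summary statistics."""
    values = [row["value"] for row in rows]
    return {"count": len(values), "mean": sum(values) / len(values)}


def toy_render(stats: Dict[str, float]) -> str:
    """Stand-in for the HTML report; renders the statistics as a list."""
    items = "".join(f"<li>{key}: {value}</li>" for key, value in stats.items())
    return f"<ul>{items}</ul>"


buffer = io.StringIO()
run_pipeline(toy_read, toy_profile, toy_render, stream=buffer)
print(buffer.getvalue())
```

Passing the three stages as callables keeps the sketch independent of any particular reader or report class.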

tslumen.cli.readers module

DataFrame readers (mostly wrappers around pandas read_* functions).

class tslumen.cli.readers.Reader(**kwargs: Any)

Bases: abc.ABC

Base class for all readers.

path: str = '???'
class tslumen.cli.readers.ReaderCsv(**kwargs: Any)

Bases: tslumen.cli.readers.Reader

CSV file reader.

comment: Optional[str] = None
compression: str = 'infer'
decimal: str = '.'
delim_whitespace: bool = False
delimiter: Optional[str] = None
doublequote: bool = True
encoding: Optional[str] = None
escapechar: Optional[str] = None
header: Any = 'infer'
index_col: int = 0
lineterminator: Optional[str] = None
nrows: Optional[int] = None
prefix: Optional[str] = None
quotechar: str = "'"
quoting: int = 0
read() → pandas.core.frame.DataFrame
sep: str = ','
skip_blank_lines: bool = True
skipfooter: int = 0
skipinitialspace: bool = False
skiprows: Optional[int] = None
thousands: Optional[str] = None
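
The field names of ReaderCsv coincide with keyword arguments of pandas.read_csv, so a wrapper of this kind can forward its fields more or less directly. The following is a minimal sketch of that idea, not the actual ReaderCsv code; MiniReaderCsv and read_kwargs are hypothetical names, and only a handful of fields are included. The pandas call itself is left as a comment so the snippet runs with the standard library alone.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class MiniReaderCsv:
    """Hypothetical subset of ReaderCsv's fields (illustration only)."""

    path: str = "???"
    sep: str = ","
    decimal: str = "."
    index_col: int = 0
    nrows: Optional[int] = None

    def read_kwargs(self) -> Dict[str, Any]:
        """Return every field except the path as keyword arguments."""
        kwargs = asdict(self)
        kwargs.pop("path")
        return kwargs


# A semicolon-separated file that uses a decimal comma.
reader = MiniReaderCsv(path="data.csv", sep=";", decimal=",")
kwargs = reader.read_kwargs()
print(kwargs)

# With pandas installed, the read would then look like:
#   pandas.read_csv(reader.path, **kwargs)
```

Note that some of the listed fields (for example prefix and delim_whitespace) have been deprecated or removed in recent pandas releases, so whether each one is accepted depends on the installed pandas version.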
class tslumen.cli.readers.ReaderExcel(**kwargs: Any)

Bases: tslumen.cli.readers.Reader

Excel file reader.

comment: Optional[str] = None
header: Any = 0
index_col: int = 0
names: Optional[List[str]] = None
nrows: Optional[int] = None
read() → pandas.core.frame.DataFrame
sheet_name: Any = 0
skipfooter: int = 0
skiprows: Optional[int] = None
thousands: Optional[str] = None
class tslumen.cli.readers.ReaderFwf(**kwargs: Any)

Bases: tslumen.cli.readers.Reader

Fixed-width formatted file reader.

colspecs: Any = 'infer'
infer_nrows: int = 100
read() → pandas.core.frame.DataFrame
sep: str = ','
widths: Any = None