randomdataset package¶

Submodules¶

randomdataset.application module¶

randomdataset.application.print_csv_test()¶: Simple test routine which creates a schema, generates a small CSV file, and prints it to stdout.

randomdataset.csvgenerator module¶

class randomdataset.csvgenerator.CSVGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, write_header: bool = True)¶

Bases: randomdataset.generators.generator.DataGenerator

write_stream(stream: IO)¶: Write a dataset to the given file-like object.
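The behaviour described for write_stream can be pictured with a minimal sketch built on the standard csv module. This is not the CSVGenerator implementation: the function write_csv_sketch, its parameters, and the make_row helper are hypothetical stand-ins, and only the num_lines and write_header names mirror the constructor signature above.

```python
import csv
import io
import random
from typing import IO, Callable, Sequence


def write_csv_sketch(
    stream: IO[str],
    field_names: Sequence[str],
    row_source: Callable[[], Sequence[object]],
    num_lines: int,
    write_header: bool = True,
) -> None:
    """Write an optional header plus num_lines generated rows to a text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    if write_header:
        writer.writerow(field_names)
    for _ in range(num_lines):
        writer.writerow(row_source())


rng = random.Random(0)  # seeded so repeated runs produce the same rows


def make_row():
    # Hypothetical row source: one random age and one random boolean.
    return (rng.randint(18, 90), rng.choice([True, False]))


buffer = io.StringIO()  # any file-like object with write() would do
write_csv_sketch(buffer, ("Age", "is_employed"), make_row, num_lines=3)
lines = buffer.getvalue().splitlines()
print(lines[0])  # Age,is_employed
```

Writing to an io.StringIO buffer keeps the sketch free of file system side effects; an open file or sys.stdout could be passed in the same way.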

randomdataset.dataset module¶

class randomdataset.dataset.Dataset(name: str, fields: Iterable[randomdataset.fields.fieldgen.FieldGen])¶

Bases: object

append_shared_state(state_name: str, data: Any)¶: Append data to the list mapped to state_name, creating the list if it is not already present.

property field_names¶: Get a tuple of names of the stored fields.

property field_types¶: Get a tuple of types of the stored fields.

property fields¶: Returns a tuple of the stored FieldGen objects.

get_field(name: str) → randomdataset.fields.fieldgen.FieldGen¶: Get the field of the given name, raising ValueError if not found.

get_field_data(name: str, length: Optional[int] = None) → Any¶: Get a single value from the named field, or an array of values if length is provided.

get_row_data() → Tuple[Any]¶: Get the data for a row, that is one value from each field.

get_shared_state(state_name)¶: Get the shared state mapped to state_name, raising an exception if the key is not present.

has_field(name: str) → bool¶: Returns True if a field with the given name is stored here.
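The row and shared-state methods above can be illustrated with a small stand-in class. DatasetSketch is hypothetical and is not the real Dataset: fields are modelled as (name, factory) pairs rather than FieldGen objects, and raising KeyError for a missing state name is an assumption, since the documentation only says that an exception is raised.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


class DatasetSketch:
    """Hypothetical stand-in, not the real randomdataset.dataset.Dataset."""

    def __init__(self, name: str, fields: Iterable[Tuple[str, Callable[[], Any]]]):
        self.name = name
        # Each field is modelled as (name, zero-argument value factory).
        self._fields = tuple(fields)
        self._shared: Dict[str, List[Any]] = {}

    @property
    def field_names(self) -> Tuple[str, ...]:
        return tuple(field_name for field_name, _ in self._fields)

    def has_field(self, name: str) -> bool:
        return name in self.field_names

    def get_row_data(self) -> Tuple[Any, ...]:
        # One value from each field, in field order.
        return tuple(make() for _, make in self._fields)

    def append_shared_state(self, state_name: str, data: Any) -> None:
        # setdefault creates the list on first use, then data is appended.
        self._shared.setdefault(state_name, []).append(data)

    def get_shared_state(self, state_name: str) -> List[Any]:
        # Assumption: a missing key surfaces as KeyError.
        return self._shared[state_name]


ds = DatasetSketch("testset", [("Age", lambda: 42), ("is_employed", lambda: True)])
ds.append_shared_state("seen_ages", 42)
ds.append_shared_state("seen_ages", 57)
row = ds.get_row_data()
print(row, ds.get_shared_state("seen_ages"))
```

Shared state of this kind would let separate fields coordinate, for example by recording values already emitted, although how the real package uses it is not stated here.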

randomdataset.schemaparser module¶

Parses the YAML schema used to define datasets. The schema must be a list of dictionary definitions, each of which names an object type to instantiate and gives its constructor keyword arguments. Wherever a dictionary appears it is interpreted as a new object to construct; all other values are treated as literals. Each entry of the top-level list must create a Dataset instance. Every dictionary must have a typename member stating the fully-qualified name of the type to instantiate, and a name member that is passed to the constructor. The following schema creates a single dataset with a few random fields.

Example:

    - name: testset
      typename: randomdataset.Dataset
      fields:
      - name: Name
        typename: randomdataset.StrFieldGen
        lmin: 6
        lmax: 14
      - name: Age
        typename: randomdataset.IntFieldGen
        vmin: 18
        vmax: 90
      - name: is_employed
        typename: randomdataset.BoolFieldGen
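The rule that every dictionary must carry both a name and a typename can be checked before any object is constructed. The sketch below is not part of randomdataset: the function find_schema_problems is hypothetical, and the schema literal is the in-memory structure the example above is assumed to have once a YAML loader has read it.

```python
from typing import Any, List

# Assumed in-memory form of the example schema after YAML loading.
schema = [
    {
        "name": "testset",
        "typename": "randomdataset.Dataset",
        "fields": [
            {"name": "Name", "typename": "randomdataset.StrFieldGen", "lmin": 6, "lmax": 14},
            {"name": "Age", "typename": "randomdataset.IntFieldGen", "vmin": 18, "vmax": 90},
            {"name": "is_employed", "typename": "randomdataset.BoolFieldGen"},
        ],
    }
]

REQUIRED_KEYS = ("name", "typename")


def find_schema_problems(node: Any, path: str = "schema") -> List[str]:
    """Return one message for every dictionary that lacks a required key."""
    problems: List[str] = []
    if isinstance(node, dict):
        for key in REQUIRED_KEYS:
            if key not in node:
                problems.append(f"{path}: missing '{key}'")
        # Nested dictionaries are object definitions too, so recurse into values.
        for key, value in node.items():
            problems.extend(find_schema_problems(value, f"{path}.{key}"))
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            problems.extend(find_schema_problems(item, f"{path}[{index}]"))
    return problems


problems = find_schema_problems(schema)
broken = find_schema_problems([{"name": "no_type"}])
print(problems)  # []
print(broken)    # ["schema[0]: missing 'typename'"]
```

Reporting a path for each problem makes it easier to locate the offending entry in a large schema file.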

class randomdataset.schemaparser.ConstrSchemaFields(value)¶

Bases: enum.Enum

Enumeration of the keys that every schema dictionary must contain.

NAME = 'name'¶

TYPENAME = 'typename'¶

randomdataset.schemaparser.parse_obj_constr(schema_dict: Dict[str, Union[dict, list, tuple]])¶

Parse the given schema dictionary and construct the object it describes. The dictionary must contain the key ConstrSchemaFields.TYPENAME (“typename”), giving the fully-qualified name of the type to construct, and the key “name”, giving the name of the object, which is passed to the constructor as the argument of the same name. All other entries become keyword arguments in the constructor call. For example, a class defined in the “__main__” module can be instantiated as follows.

Example:

    import randomdataset.schemaparser

    class CreateTest:
        def __init__(self, name, a, b):
            self.name = name
            self.a = a
            self.b = b

    create_dict = {"name": "test", "typename": "__main__.CreateTest", "a": 1, "b": "two"}
    test = randomdataset.schemaparser.parse_obj_constr(create_dict)
    print(test.name, test.a, test.b)  # prints "test 1 two"
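One way a constructor of this kind could work is sketched below using only the standard library. This is an illustration under stated assumptions, not the code behind parse_obj_constr: the function build_from_schema is hypothetical, it resolves the type with importlib, and it uses types.SimpleNamespace as a convenient target because that type accepts arbitrary keyword arguments.

```python
import importlib
from typing import Any


def build_from_schema(node: Any) -> Any:
    """Hypothetical sketch: recursively turn schema dictionaries into objects."""
    if isinstance(node, (list, tuple)):
        return [build_from_schema(item) for item in node]
    if not isinstance(node, dict):
        return node  # literals pass through unchanged

    # Build nested definitions first so the constructor receives real objects.
    kwargs = {key: build_from_schema(value) for key, value in node.items()}
    typename = kwargs.pop("typename")  # KeyError if the schema omits it
    module_name, _, attr = typename.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**kwargs)  # "name" stays in kwargs and reaches the constructor


obj = build_from_schema(
    {
        "name": "outer",
        "typename": "types.SimpleNamespace",
        "children": [{"name": "inner", "typename": "types.SimpleNamespace", "a": 1}],
    }
)
print(obj.name, obj.children[0].name, obj.children[0].a)  # outer inner 1
```

Splitting the type name at its last dot means everything before it is treated as an importable module path, which is why the example in the docstring refers to its class as “__main__.CreateTest”.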

randomdataset.schemaparser.parse_schema(stream_or_file: Union[str, IO]) → List[randomdataset.generators.generator.DataGenerator]¶: Parse the given file or stream and return the list of Dataset objects it specifies.

randomdataset.fields module¶

randomdataset.generator module¶

class randomdataset.generator.DataGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, file_mode: str = 'w', file_ext: str = '')¶