randomdataset package

Submodules

randomdataset.application module

randomdataset.application.print_csv_test()

Simple test routine which creates a schema, generates a small CSV file, and prints it to stdout.

randomdataset.csvgenerator module

class randomdataset.csvgenerator.CSVGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, write_header: bool = True)

Bases: randomdataset.generators.generator.DataGenerator

write_stream(stream: IO)

Write a dataset to the given file-like object.
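
As a rough illustration of what writing a dataset to a file-like object involves, the sketch below uses only the standard library. The write_csv_stream function and the sample rows are hypothetical stand-ins, not the package's implementation; the write_header flag simply mirrors the constructor argument of the same name.

```python
import csv
import io
from typing import IO, Iterable, Sequence


def write_csv_stream(stream: IO, header: Sequence[str], rows: Iterable[Sequence],
                     write_header: bool = True) -> None:
    """Hypothetical stand-in: write an optional header, then the rows, as CSV."""
    writer = csv.writer(stream, lineterminator="\n")
    if write_header:
        writer.writerow(header)
    writer.writerows(rows)


# Any file-like object works; StringIO keeps the example in memory.
buffer = io.StringIO()
write_csv_stream(buffer, ("Name", "Age"), [("abcdef", 42), ("ghijkl", 27)])
csv_text = buffer.getvalue()
```

Passing write_header=False to the stand-in would emit only the two data rows.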

randomdataset.dataset module

class randomdataset.dataset.Dataset(name: str, fields: Iterable[randomdataset.fields.fieldgen.FieldGen])

Bases: object

append_shared_state(state_name: str, data: Any)

Append data to the list mapped to state_name, creating a new list if none is present.
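
The append-or-create behaviour described above can be sketched with a plain dictionary. The module-level shared_state dictionary and the standalone function are assumptions made for illustration; how Dataset actually stores its shared state is not documented here.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for the shared-state store held by a Dataset.
shared_state: Dict[str, List[Any]] = {}


def append_shared_state(state_name: str, data: Any) -> None:
    """Append data to the list under state_name, creating the list on first use."""
    shared_state.setdefault(state_name, []).append(data)


append_shared_state("seen_ids", 101)  # creates the list
append_shared_state("seen_ids", 102)  # appends to the existing list
```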

property field_names

Get a tuple of names of the stored fields.

property field_types

Get a tuple of types of the stored fields.

property fields

Returns a tuple of the stored FieldGen objects.

get_field(name: str) → randomdataset.fields.fieldgen.FieldGen

Get the field of the given name, raising ValueError if not found.

get_field_data(name: str, length: Optional[int] = None) → Any

Get a single value from the named field, or an array of values if length is provided.

get_row_data() → Tuple[Any]

Get the data for a row, that is one value from each field.
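
To make "one value from each field" concrete, here is a minimal sketch in which each field is modelled as a callable that produces one value. The fields dictionary, the value ranges, and the seeded generator are all hypothetical; the real FieldGen objects are not reproduced here.

```python
import random
from typing import Any, Callable, Dict, Tuple

rng = random.Random(0)  # seeded so repeated runs give the same values

# Hypothetical stand-ins for field generators: name -> callable producing one value.
fields: Dict[str, Callable[[], Any]] = {
    "Age": lambda: rng.randint(18, 90),
    "is_employed": lambda: rng.random() < 0.5,
}


def get_row_data() -> Tuple[Any, ...]:
    """Return one freshly generated value per field, in field order."""
    return tuple(make_value() for make_value in fields.values())


row = get_row_data()
```

The sketch annotates the return value as Tuple[Any, ...], which is the typing form for a tuple of arbitrary length.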

get_shared_state(state_name)

Get the shared state mapped to state_name, raising an exception if the key is not present.

has_field(name: str) → bool

Returns True if a field with the given name is stored here.

randomdataset.schemaparser module

Parses the YAML schema used to define datasets. The schema must be a list of dictionaries, each naming an object type to instantiate along with its constructor keyword arguments. Wherever a dictionary appears it is interpreted as a new object to construct; all other values are treated as literals. The entries of the top-level list must create Dataset instances. Every dictionary must have a typename member giving the fully-qualified name of the type to instantiate, and a name member to pass to the constructor. For example, to create a single dataset with a few random fields:

Example:
    - name: testset
      typename: randomdataset.Dataset
      fields:
      - name: Name
        typename: randomdataset.StrFieldGen
        lmin: 6
        lmax: 14
      - name: Age
        typename: randomdataset.IntFieldGen
        vmin: 18
        vmax: 90
      - name: is_employed
        typename: randomdataset.BoolFieldGen

class randomdataset.schemaparser.ConstrSchemaFields(value)

Bases: enum.Enum

An enumeration.

NAME = 'name'

TYPENAME = 'typename'

randomdataset.schemaparser.parse_obj_constr(schema_dict: Dict[str, Union[dict, list, tuple]])

Parse and construct an object from the given schema dictionary. The dictionary must contain the key ConstrSchemaFields.TYPENAME (“typename”), giving the fully-qualified name of the type to construct, and a key “name” giving the name of the object to create, which is passed as a constructor argument of the same name. All other keys become keyword arguments in the constructor call. For example, a class defined in the “__main__” module can be instantiated as follows.

Example:
    import randomdataset.schemaparser

    class CreateTest:
        def __init__(self, name, a, b):
            self.name = name
            self.a = a
            self.b = b

    create_dict = {"name": "test", "typename": "__main__.CreateTest", "a": 1, "b": "two"}
    test = randomdataset.schemaparser.parse_obj_constr(create_dict)
    print(test.name, test.a, test.b)  # prints "test 1 two"
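
For readers curious how a dictionary with a typename key can be turned into an object, the following is a minimal sketch of one possible approach, not the package's actual code. The find_type and construct helpers are hypothetical names, and types.SimpleNamespace from the standard library stands in for a user-defined class so the example runs without the package installed.

```python
import importlib
from typing import Any


def find_type(qualified_name: str) -> type:
    """Hypothetical helper: split 'pkg.module.Type' and import the type."""
    module_name, _, type_name = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), type_name)


def construct(value: Any) -> Any:
    """Build objects from dicts carrying 'typename'; pass other values through."""
    if isinstance(value, dict) and "typename" in value:
        # Every key except 'typename' becomes a constructor keyword argument.
        kwargs = {k: construct(v) for k, v in value.items() if k != "typename"}
        return find_type(value["typename"])(**kwargs)
    if isinstance(value, (list, tuple)):
        return [construct(item) for item in value]
    return value  # literal


obj = construct({
    "name": "test",
    "typename": "types.SimpleNamespace",
    "a": 1,
    "child": {"name": "inner", "typename": "types.SimpleNamespace"},
})
```

Because construct recurses into dictionary values and list items, nested dictionaries become nested objects, which matches the rule that dictionaries are treated as new objects wherever they appear.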

randomdataset.schemaparser.parse_schema(stream_or_file: Union[str, IO]) → List[randomdataset.generators.generator.DataGenerator]

Parse the given file or stream and return the list of Dataset objects it specifies.

randomdataset.fields module

randomdataset.generator module

class randomdataset.generator.DataGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, file_mode: str = 'w', file_ext: str = '')

Bases: object

generate_fields(length: int)

Yields an array of data of the given length from each field in the dataset.

generate_rows()

Yields self.num_lines rows from the dataset.
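
Yielding a fixed number of rows lazily can be sketched as a plain Python generator. The standalone generate_rows function and its get_row callback are hypothetical; in the class the row source would be the dataset itself.

```python
from itertools import count
from typing import Any, Callable, Iterator, Tuple


def generate_rows(get_row: Callable[[], Tuple[Any, ...]],
                  num_lines: int) -> Iterator[Tuple[Any, ...]]:
    """Hypothetical stand-in: lazily yield num_lines rows, one get_row call each."""
    for _ in range(num_lines):
        yield get_row()


# A counter-based row source makes the output easy to inspect.
counter = count(1)
rows = list(generate_rows(lambda: (next(counter), "x"), 3))
```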

get_header() → Tuple[str]

Return the header for a table; the default is the tuple of field names.

write_file(dest: Union[str, IO])

Write a dataset to the given file path or file-like object.
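
Accepting either a path or a file-like object is commonly handled with a type check, as in the sketch below. This standalone write_file function, which takes the text to write as an argument, is a hypothetical stand-in and may differ from how the class dispatches to write_stream.

```python
import io
import os
import tempfile
from typing import IO, Union


def write_file(dest: Union[str, IO], text: str, file_mode: str = "w") -> None:
    """Hypothetical stand-in: open dest if it is a path, else write to it directly."""
    if isinstance(dest, str):
        with open(dest, file_mode) as stream:
            stream.write(text)
    else:
        dest.write(text)


# File-like destination.
buffer = io.StringIO()
write_file(buffer, "a,b\n1,2\n")

# Path destination, written inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "out.csv")
    write_file(path, "a,b\n1,2\n")
    with open(path) as handle:
        from_path = handle.read()
```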

write_stream(stream: IO)

Write a dataset to the given file-like object.

write_to_target(target: Any)

Write a dataset to the given target, which can be a string path to a file or directory, or some other object type expected by the override of this method in a subclass.

randomdataset.utils module

randomdataset.utils.find_type_def(qualified_name: str)

Module contents