randomdataset package¶
Submodules¶
randomdataset.application module¶
-
randomdataset.application.print_csv_test()¶ Simple test routine which creates a schema, generates a small CSV file, and prints it to stdout.
randomdataset.csvgenerator module¶
-
class
randomdataset.csvgenerator.CSVGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, write_header: bool = True)¶ Bases:
randomdataset.generators.generator.DataGenerator
write_stream(stream: IO)¶ Write a dataset to the given file-like object.
-
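As an illustration of writing tabular rows to a file-like object with an optional header row, here is a minimal standard-library sketch. It does not use randomdataset itself: `write_csv_stream`, `header`, and `rows` are hypothetical stand-ins for the generator, the dataset's field names, and its generated rows, and the real class may format its output differently.

```python
import csv
import io
from typing import IO, Iterable, Sequence


def write_csv_stream(stream: IO, header: Sequence[str], rows: Iterable[Sequence],
                     write_header: bool = True) -> None:
    """Hypothetical stand-in: write rows as CSV to a file-like object."""
    writer = csv.writer(stream, lineterminator="\n")
    if write_header:
        writer.writerow(header)  # header row is optional, mirroring write_header
    writer.writerows(rows)


# Write two rows into an in-memory text buffer instead of a real file.
buffer = io.StringIO()
write_csv_stream(buffer, ("Name", "Age"), [("alice", 30), ("bob", 41)])
csv_text = buffer.getvalue()
print(csv_text)
```

Any object with a `write` method accepting text (an open file, `sys.stdout`, or `io.StringIO`) can serve as the stream in this sketch.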
randomdataset.dataset module¶
-
class
randomdataset.dataset.Dataset(name: str, fields: Iterable[randomdataset.fields.fieldgen.FieldGen])¶ Bases:
object
Append data to the list mapped to state_name, adding the new list if not present.
-
property
field_names¶ Get a tuple of names of the stored fields.
-
property
field_types¶ Get a tuple of types of the stored fields.
-
property
fields¶ Returns a tuple of the stored FieldGen objects.
-
get_field(name: str) → randomdataset.fields.fieldgen.FieldGen¶ Get the field of the given name, raising ValueError if not found.
-
get_field_data(name: str, length: Optional[int] = None) → Any¶ Get a single value from the named field, or an array of values if length is provided.
-
get_row_data() → Tuple[Any]¶ Get the data for a row, that is one value from each field.
Get the shared state mapped to state_name, raising an exception if the key is not present.
-
has_field(name: str) → bool¶ Returns True if a field with the given name is stored here.
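The lookup behaviour documented for Dataset can be pictured with a small self-contained sketch. `SketchField` and `SketchDataset` below are hypothetical stand-ins written only for illustration; they are not the library's classes, and the real FieldGen interface is assumed rather than known.

```python
import random
from typing import Any, Callable, Iterable, Optional, Tuple


class SketchField:
    """Hypothetical stand-in for a field generator: a name plus a value factory."""

    def __init__(self, name: str, make_value: Callable[[], Any]):
        self.name = name
        self.make_value = make_value

    def generate(self, length: Optional[int] = None) -> Any:
        # One value when no length is given, otherwise a list of `length` values.
        if length is None:
            return self.make_value()
        return [self.make_value() for _ in range(length)]


class SketchDataset:
    """Hypothetical stand-in mirroring the documented lookups."""

    def __init__(self, name: str, fields: Iterable[SketchField]):
        self.name = name
        self._fields = {f.name: f for f in fields}

    @property
    def field_names(self) -> Tuple[str, ...]:
        return tuple(self._fields)

    def has_field(self, name: str) -> bool:
        return name in self._fields

    def get_field(self, name: str) -> SketchField:
        if name not in self._fields:
            raise ValueError(f"no field named {name!r}")
        return self._fields[name]

    def get_field_data(self, name: str, length: Optional[int] = None) -> Any:
        return self.get_field(name).generate(length)

    def get_row_data(self) -> Tuple[Any, ...]:
        # One value from each field, in field order.
        return tuple(f.generate() for f in self._fields.values())


rng = random.Random(0)  # seeded so repeated runs give the same values
ds = SketchDataset("testset", [
    SketchField("Age", lambda: rng.randint(18, 90)),
    SketchField("is_employed", lambda: rng.random() < 0.5),
])
row = ds.get_row_data()                     # e.g. one age and one boolean
ages = ds.get_field_data("Age", length=3)   # a list of three ages
```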
randomdataset.schemaparser module¶
Parses the YAML schema used to define datasets. The schema must be a list of dictionary definitions where each defines an object type to instantiate and its constructor keyword arguments. Dictionaries are interpreted as new objects in any place they are used and other values as literals. The top level list must be for creating Dataset instances. Every dictionary must have a typename member stating the fully-qualified name for the type to instantiate, and a name argument to pass to the constructor. For example, to create a single dataset item with a few random fields:
- Example:
    - name: testset
      typename: randomdataset.Dataset
      fields:
      - name: Name
        typename: randomdataset.StrFieldGen
        lmin: 6
        lmax: 14
      - name: Age
        typename: randomdataset.IntFieldGen
        vmin: 18
        vmax: 90
      - name: is_employed
        typename: randomdataset.BoolFieldGen
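The rule that every dictionary in the schema carries both a name and a typename can be checked mechanically. The sketch below writes the example schema as the plain Python structure a YAML loader would be expected to produce, then walks it with a hypothetical helper, `collect_typenames`, which is not part of randomdataset.

```python
from typing import Any, List

# The example schema as plain Python data (assumed shape after YAML loading).
schema = [
    {
        "name": "testset",
        "typename": "randomdataset.Dataset",
        "fields": [
            {"name": "Name", "typename": "randomdataset.StrFieldGen", "lmin": 6, "lmax": 14},
            {"name": "Age", "typename": "randomdataset.IntFieldGen", "vmin": 18, "vmax": 90},
            {"name": "is_employed", "typename": "randomdataset.BoolFieldGen"},
        ],
    }
]


def collect_typenames(node: Any, found: List[str]) -> List[str]:
    """Hypothetical helper: visit every dictionary, check required keys, collect typenames."""
    if isinstance(node, dict):
        for key in ("name", "typename"):
            if key not in node:
                raise KeyError(f"schema dictionary is missing required key {key!r}")
        found.append(node["typename"])
        for value in node.values():
            collect_typenames(value, found)  # nested dictionaries are objects too
    elif isinstance(node, (list, tuple)):
        for item in node:
            collect_typenames(item, found)
    return found  # literals (str, int, ...) contribute nothing


typenames = collect_typenames(schema, [])
print(typenames)
```

A check of this kind only validates the shape of the schema; it does not confirm that the named types can actually be imported.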
-
class
randomdataset.schemaparser.ConstrSchemaFields(value)¶ Bases:
enum.Enum
An enumeration.
-
NAME= 'name'¶
-
TYPENAME= 'typename'¶
-
-
randomdataset.schemaparser.parse_obj_constr(schema_dict: Dict[str, Union[dict, list, tuple]])¶ Parse and construct an object from the given schema dictionary. The dictionary must contain the key ConstrSchemaFields.TYPENAME (“typename”), giving the fully-qualified name of the type to construct, and the key ConstrSchemaFields.NAME (“name”), giving the name of the object to create, which is passed as a constructor argument of the same name. All other entries become keyword arguments in the constructor call. For example, a class can be instantiated from the “__main__” module.
- Example:
    import randomdataset.schemaparser

    class CreateTest:
        def __init__(self, name, a, b):
            self.name = name
            self.a = a
            self.b = b

    create_dict = {"name": "test", "typename": "__main__.CreateTest", "a": 1, "b": "two"}
    test = randomdataset.schemaparser.parse_obj_constr(create_dict)
    print(test.name, test.a, test.b)  # prints "test 1 two"
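One plausible way to turn a fully-qualified type name into a constructed object is to split the name into module and attribute, import the module, and call the type with the remaining entries as keyword arguments. The sketch below is an assumption about the general technique, not the library's implementation: `construct_from_dict` is a hypothetical name, nested dictionaries are not handled, and `argparse.Namespace` is used only because it accepts arbitrary keyword arguments.

```python
import importlib
from typing import Any, Dict


def construct_from_dict(schema_dict: Dict[str, Any]) -> Any:
    """Hypothetical sketch: build an object from a dict with a 'typename' key."""
    kwargs = dict(schema_dict)         # copy so the caller's dict is left untouched
    typename = kwargs.pop("typename")  # e.g. "argparse.Namespace"
    module_name, _, attr_name = typename.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr_name)
    return cls(**kwargs)               # "name" and the rest become keyword arguments


obj = construct_from_dict(
    {"name": "test", "typename": "argparse.Namespace", "a": 1, "b": "two"}
)
print(obj.name, obj.a, obj.b)  # prints "test 1 two"
```

Because the type is resolved by import, any schema processed this way can instantiate arbitrary importable classes, so schemas from untrusted sources deserve caution.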
-
randomdataset.schemaparser.parse_schema(stream_or_file: Union[str, IO]) → List[randomdataset.generators.generator.DataGenerator]¶ Parse the given file or stream and return the list of Dataset objects it specifies.
randomdataset.fields module¶
randomdataset.generator module¶
-
class
randomdataset.generator.DataGenerator(dataset: randomdataset.dataset.Dataset, num_lines: int, file_mode: str = 'w', file_ext: str = '')¶ Bases:
object
generate_fields(length: int)¶ Yields an array of data, length values long, from each field in the dataset.
-
generate_rows()¶ Yields self.num_lines rows from the dataset.
-
get_header() → Tuple[str]¶ Return the header for a table, default is the tuple of field names.
-
write_file(dest: Union[str, IO])¶ Write a dataset to the given file path or file-like object.
-
write_stream(stream: IO)¶ Write a dataset to the given file-like object.
-
write_to_target(target: Any)¶ Write a dataset to the given target, which can be a string path to a file or directory, or some other object type expected by the override of this method in a subclass.
-
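The split between write_file, which accepts a path or a stream, and write_stream, which a subclass overrides, follows a common template-method pattern. The sketch below illustrates that pattern with hypothetical stand-ins (`SketchGenerator`, `TabSeparatedGenerator`) and placeholder row values; it assumes nothing about how the library's own classes are implemented.

```python
import io
from typing import IO, Iterator, Sequence, Tuple, Union


class SketchGenerator:
    """Hypothetical stand-in for a row-producing generator base class."""

    def __init__(self, header: Sequence[str], num_lines: int, file_mode: str = "w"):
        self.header = tuple(header)
        self.num_lines = num_lines
        self.file_mode = file_mode

    def generate_rows(self) -> Iterator[Tuple[int, ...]]:
        # Yield exactly num_lines rows; the values are placeholder counters.
        width = len(self.header)
        for i in range(self.num_lines):
            yield tuple(i * width + j for j in range(width))

    def write_stream(self, stream: IO) -> None:
        raise NotImplementedError("subclasses decide the output format")

    def write_file(self, dest: Union[str, IO]) -> None:
        # A string is treated as a path to open; anything else as an open stream.
        if isinstance(dest, str):
            with open(dest, self.file_mode) as stream:
                self.write_stream(stream)
        else:
            self.write_stream(dest)


class TabSeparatedGenerator(SketchGenerator):
    """Hypothetical subclass that only overrides the format-specific step."""

    def write_stream(self, stream: IO) -> None:
        stream.write("\t".join(self.header) + "\n")
        for row in self.generate_rows():
            stream.write("\t".join(str(v) for v in row) + "\n")


out = io.StringIO()
TabSeparatedGenerator(("a", "b"), num_lines=2).write_file(out)
tsv_text = out.getvalue()
print(tsv_text)
```

Keeping path handling in the base class means each output format only has to implement the stream-writing step.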