Delimited (CSV)
This module generates delimited data. The delimiter defaults to a comma (,).
To generate completely random data:
from genie_pkg import delimited_genie
# Each column spec is a tuple: the type name followed by that type's arguments.
colspecs = [
    ('special_string', 13),
    ('int', 15),
    ('float', 7), ('float', 8), ('str', 9),
    ('str', 5), ('int', 5), ('email', 15, 'mail.com'),
    ('date', '%d/%m/%Y', 3), ('cc_mastercard',),
    ('cc_visacard', 13), ('geo_coord', (40.84, -73.87,),)
]
nrows = 10
encoding = 'windows-1252'
for d in delimited_genie.generate(colspecs, nrows, encoding, delimiter='|'):
    do_something(d)  # placeholder for your own handling of each generated row
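The example above leaves do_something undefined. As a minimal sketch, assuming each generated row arrives as one delimited line (either bytes in the chosen encoding or an already decoded str; the source does not say which), a hypothetical consumer named split_row could break it into fields with the standard csv module. The function name and its defaults are illustrative and not part of genie_pkg.

```python
import csv

def split_row(row, encoding='windows-1252', delimiter='|'):
    """Hypothetical stand-in for do_something: return the row's fields as a list."""
    # Assumption: a row may be bytes in the target encoding, so decode it first.
    text = row.decode(encoding) if isinstance(row, bytes) else row
    # csv.reader copes with quoted fields that contain the delimiter.
    return next(csv.reader([text], delimiter=delimiter))

sample = 'a£b|42|3.14'.encode('windows-1252')
print(split_row(sample))  # ['a£b', '42', '3.14']
```

Using csv.reader rather than str.split keeps quoted fields intact if a value happens to contain the delimiter.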
To anonymise only some columns of existing CSV data (for example, to remove PII):
from genie_pkg import delimited_genie
input_encoding = 'utf-8'
row = 'FReNG,£Ni,£iFthtR¥ubOswUPh,mQWJoypv,F¢MFcR'.encode(input_encoding)
# Each spec pairs a column index with the type of random value to put there.
anonymous_col_specs = [(1, 'int'), (4, 'float')]
anonymised = delimited_genie.anonymise_columns(
    row, anonymous_col_specs, encoding=input_encoding)
do_something(anonymised)
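The call above handles a single row. To process a whole file, one option is to apply the same call line by line. The sketch below is an illustration only: anonymise_lines is a hypothetical helper, and stripping the trailing line break before each call is an assumption about what the row argument should contain. The row function is passed in so the sketch runs without genie_pkg installed.

```python
def anonymise_lines(lines, anonymise_row):
    """Yield each input line after passing it through anonymise_row."""
    for line in lines:
        # Assumption: drop the line break so it is not treated as part of the last column.
        yield anonymise_row(line.rstrip(b'\r\n'))

# With genie_pkg installed, anonymise_row could wrap the call shown above, e.g.
#   lambda row: delimited_genie.anonymise_columns(
#       row, anonymous_col_specs, encoding=input_encoding)
# A stub that upper-cases the row is used here so the sketch is self-contained.
demo = [b'a,b,c\n', b'd,e,f\n']
print(list(anonymise_lines(demo, lambda row: row.upper())))  # [b'A,B,C', b'D,E,F']
```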
The CSV generator supports the following column types:
- float (if the number of decimal places is not passed, it defaults to 2)
- int
- str
- special_string (includes special characters)
- email (length, optional_domain): if the domain is not passed, it defaults to dummy.com. The length includes the domain you specify.
- date (format, optional delta days): the format must be a valid Python datetime format. The optional delta is a number of days to go back or into the future.
- geo_coord (center: tuple(lat, lng), radius_in_meters, accuracy_digits): center defaults to Melbourne, radius_in_meters to 10000 and accuracy_digits to 3.
- cc_mastercard: generates a valid 16-digit Mastercard number.
- cc_visacard (length): length can be 16 or 13; generates a 16-digit number by default.
- one_of (list of choices): returns a random choice from the list passed. Useful for things like product codes, country codes, etc.
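The list describes the generated card numbers as valid. For card numbers, "valid" commonly means that the number passes the Luhn checksum; whether genie_pkg applies exactly this check is an assumption. A minimal sketch of such a check, which could be run against generated cc_mastercard or cc_visacard values (luhn_valid is a hypothetical helper, not part of the library):

```python
def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(ch) for ch in str(number)]
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the doubled value
        total += d
    return total % 10 == 0

print(luhn_valid('4111111111111111'))  # True (widely used Visa test number)
print(luhn_valid('4111111111111112'))  # False (last digit altered)
```

The same function works for both 16-digit and 13-digit numbers, since the checksum always starts from the rightmost digit.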