Databene Benerator Publisher's description
from Volker Bergmann
A framework for creating realistic and valid high-volume test data, used for testing
Databene Benerator was designed as a framework for creating realistic and valid high-volume test data for use in testing (unit, integration, load) and showcase setup.
Metadata constraints are imported from systems and/or configuration files. Data can be imported from and exported to files and systems, anonymized, or generated from scratch. Domain packages provide reusable generators for creating domain-specific data such as names and addresses, internationalizable by language and region. It is highly customizable through plugins and configuration options.
Here are some key features of "Databene Benerator":
· efficient operation:
the minimum requirement for any generation feature is to generate at least one million objects per hour on common development hardware.
Benerator can run multithreaded, making efficient use of multi-core systems.
Benerator's database access is highly optimized, supporting persistence of several thousand rows per second.
· domain packages provide easy localized and regionalized creation of commonly used entities:
address: street, house number, zip code, city name, country, phone number.
person: names, titles, salutations, address.
further domain packages are planned and will be developed on demand.
· data quality assurance:
supports single- and multi-field constraints, e.g. generating consistent values for a person's gender, salutation, and first name.
ability to validate generated data: Data is generated according to the constraint definitions. If the application under test uses hidden knowledge for input validation, a custom validator can be plugged in to filter out inadequate data sets, e.g. to validate addresses against a postal database.
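The two ideas above can be sketched in plain Java. This is a minimal illustration, not Benerator's actual API: all class and method names here are hypothetical. Gender-keyed name pools keep the multi-field constraint (gender, salutation, first name) consistent by construction, and a pluggable validator filters out data sets the generator cannot know are invalid.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch; names are illustrative, not Benerator's real API.
public class ConsistentPersonSketch {

    // Gender-keyed pools keep gender, salutation and first name consistent.
    static final Map<String, List<String>> FIRST_NAMES = Map.of(
            "MALE", List.of("Hans", "Peter", "Karl"),
            "FEMALE", List.of("Anna", "Maria", "Julia"));
    static final Map<String, String> SALUTATIONS = Map.of(
            "MALE", "Mr.", "FEMALE", "Ms.");

    record Person(String gender, String salutation, String firstName) {}

    // Generate a person whose three fields agree with each other.
    static Person generate(Random random) {
        String gender = random.nextBoolean() ? "MALE" : "FEMALE";
        List<String> pool = FIRST_NAMES.get(gender);
        return new Person(gender, SALUTATIONS.get(gender),
                pool.get(random.nextInt(pool.size())));
    }

    // A custom validator plugged in as a filter over generated data.
    static Person generateValid(Random random, Predicate<Person> validator) {
        Person p;
        do {
            p = generate(random);          // produce a candidate
        } while (!validator.test(p));      // discard inadequate data sets
        return p;
    }
}
```

The validator is just a predicate here; in practice it could wrap any external check, such as a lookup in a postal database.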
· ease of use for programmers: APIs are provided or planned for the following purposes:
dynamic data creation or access for stress test applications (planned).
command line invocation for continuous integration (planned).
providing an initial database setup for application deployment (planned).
providing and ensuring consistent data for unit tests (planned).
· component-based, easily extensible API:
predefined generators create simple data types, arrays, collections, and strings that match regular expressions.
extensibility by custom generators: A clear component contract allows easy implementation of custom generators with clean life-cycle and resource management.
internationalization: Generated data can be rendered in different formats (like time values) or different languages (like salutations or titles).
region concept: Data can be categorized and grouped hierarchically (e.g. cities of a state, country or continent).
accepts input in multiple formats from multiple sources: Specifying a data model is easy. A multitude of generator mechanisms is provided, such as file or database import, regular-expression generators, sample lists, distribution functions, and different input formats.
provides output in multiple formats at the same time (planned): Since generated information may not be retrievable later from the target systems (e.g. PIN numbers), simultaneous output into multiple targets should be provided (e.g. writing users to both a database and a CSV file). A plugin mechanism for data output should be provided to store data in other systems (e.g. LDAP) or file formats (e.g. proprietary formats).
import of complex data (planned): import of entities (or, better, entity graphs) from databases and files.
offers powerful randomization options and is extensible with custom ones.
supports grouping of data into hierarchical data sets. Data sets may overlap and form several parallel hierarchies.
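The "clear component contract" and regular-expression generation mentioned above can be sketched as follows. This is a hypothetical illustration under assumed names, not Benerator's real interfaces: a generator exposes an explicit life cycle (init, generate, close), and one implementation produces strings matching a simple zip-code-like pattern `\d{5}`.

```java
import java.util.Random;
import java.util.regex.Pattern;

// Hypothetical component contract; names are illustrative, not
// Benerator's actual Generator interface.
interface SimpleGenerator<E> {
    void init();      // acquire resources before the first call
    E generate();     // produce the next data item
    void close();     // release resources (clean life-cycle management)
}

// Example implementation: strings matching the pattern \d{5}.
class ZipCodeGenerator implements SimpleGenerator<String> {
    private Random random;

    @Override public void init() { random = new Random(); }

    @Override public String generate() {
        StringBuilder sb = new StringBuilder(5);
        for (int i = 0; i < 5; i++)
            sb.append(random.nextInt(10)); // one random digit per position
        return sb.toString();
    }

    @Override public void close() { random = null; }

    // Helper to check the output against the intended pattern.
    static boolean matches(String s) {
        return Pattern.matches("\\d{5}", s);
    }
}
```

A framework built around such a contract can manage any custom generator uniformly: call init() once, generate() repeatedly, and close() when done.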
· data generation from scratch
· import and anonymization of production data: Existing data can be imported and anonymized by overwriting certain attributes with generated data.
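The anonymization approach above can be sketched in a few lines. This is an assumed illustration, not Benerator's implementation: a record imported from production is copied, its sensitive attributes ("name" and "phone" here are hypothetical field names) are overwritten with generated values, and all other attributes are preserved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of anonymization by attribute overwriting;
// field names and method names are illustrative only.
public class AnonymizerSketch {

    static Map<String, String> anonymize(Map<String, String> row, int seq) {
        Map<String, String> copy = new LinkedHashMap<>(row);
        // Overwrite sensitive attributes with synthetic values.
        copy.put("name", "Customer-" + seq);
        copy.put("phone", String.format("+49-000-%06d", seq));
        return copy; // structure and remaining attributes are preserved
    }
}
```

Keeping non-sensitive attributes intact preserves realistic value distributions (e.g. cities, dates) while removing personally identifying data.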
· few dependencies on external libraries: For maximum compatibility with the runtime environment, the use of third-party tools is avoided where possible.
FreeMarker is not required for operation (unless you really need to use FreeMarker templates).
commons-logging is required; it is used to increase platform independence by allowing different logging infrastructures to be plugged in.
Support for all major databases:
· all common SQL data types are supported
· Benerator was tested with, and provides examples for:
· Oracle 10g (thin driver)
· MS SQL Server
· MySQL 5
· PostgreSQL 8.2
· HSQL 1.8
· Derby 10.3
System Requirements:
· Java 5.0
Limitations:
· The following SQL types are not supported:
· API not final
· Database persistence supports only inserts, no updates of pre-existing or previously persisted data.
· Sequence concept is not final yet.
Program Release Status: Major Update
Program Install Support: Install and Uninstall