The Sample Data Generator produces realistic, cohesive, and 100% fictional datasets for use in demonstrations and testing.
The SDG favors statistically realistic patterns. For example, a student with poor attendance generally has poor grades, students are the appropriate age for their grade level, and students who are English learners have home languages that track to their ethnicity. The system is configurable and can produce arbitrarily large datasets.
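As a rough illustration of what "statistically realistic" means here, the following minimal Python sketch generates students whose age fits their grade level and whose GPA tracks their attendance. It is not the SDG's implementation: the `Student` fields, the function names, and all distributions and constants are hypothetical choices made for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Student:
    grade_level: int        # 1 through 12
    age: int                # in years
    attendance_rate: float  # fraction of days attended, 0.5 to 1.0
    gpa: float              # 0.0 to 4.0


def generate_student(rng: random.Random) -> Student:
    """Generate one fictional student with correlated attributes."""
    grade_level = rng.randint(1, 12)

    # Assumption: the typical age for a grade is grade + 5, and roughly
    # one student in four is a year older.
    age = grade_level + 5 + rng.choice([0, 0, 0, 1])

    # Assumption: attendance clusters near 93% and is clamped to [0.5, 1.0].
    attendance_rate = min(1.0, max(0.5, rng.gauss(0.93, 0.07)))

    # GPA is driven by attendance, with random noise so that the
    # relationship is a tendency rather than a fixed rule.
    base_gpa = 4.0 * (attendance_rate - 0.5) / 0.5
    gpa = min(4.0, max(0.0, base_gpa + rng.gauss(0.0, 0.5)))

    return Student(grade_level, age, round(attendance_rate, 3), round(gpa, 2))


def generate_students(count: int, seed: int = 0) -> List[Student]:
    """Generate `count` students; the same seed yields the same dataset."""
    rng = random.Random(seed)
    return [generate_student(rng) for _ in range(count)]
```

Calling `generate_students(1000, seed=42)` returns a reproducible list in which low-attendance students have lower GPAs on average, while individual students can still deviate from the trend.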
Although the SDG creates data with realistic patterns, the data is randomly generated and must not be used in place of real-world data for purposes such as training machine learning models or other algorithmic approaches.
For more information, see:
- How to Submit an Issue
- How to Submit a Feature Request
- Review ongoing development work at SDG Project
The Ed-Fi Alliance welcomes code contributions from the community. Please read the Ed-Fi Contribution Guidelines for detailed information on how to contribute source code.
Looking for an easy way to get started? Search for tickets with the label "up-for-grabs" in Tracker. These are nice-to-have, low-priority tickets that should not require in-depth knowledge of the code base and architecture.
Copyright (c) 2021 Ed-Fi Alliance, LLC and contributors.
Licensed under the Apache License, Version 2.0 (the "License").
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
See NOTICES for additional copyright and license notifications.