Python Display Filter Query Language (PyDFQL) offers an intuitive and powerful query language, similar to Wireshark's display filter, for working with various data structures and formats, including Python dictionaries, lists, objects, and SQL databases.
To get started quickly, follow the steps below:
First, install the package using pip:
pip3 install pydfql
Next, import the necessary module and initialize the appropriate display filter with some data.
The example below initializes an ObjectDisplayFilter with a list of objects:
from dataclasses import dataclass
from pydfql import ObjectDisplayFilter

@dataclass
class Actor:
    name: list
    age: dict
    gender: str

actors = [
    Actor(["Laurence", "Fishburne"], {"born": "1961"}, "male"),
    Actor(["Keanu", "Reeves"], {"born": "1964"}, "male"),
    Actor(["Joe", "Pantoliano"], {"born": "1951"}, "male"),
    Actor(["Carrie-Anne", "Moss"], {"born": "1967"}, "female")
]

df = ObjectDisplayFilter(actors)
Note that PyDFQL also supports other data sources, such as Python dictionaries, lists, and SQL databases.
Once the display filter is initialized, you can start filtering the data using the display filter query language. For example, let's filter the actors whose birth year is after 1960:
filter_query = "age.born > 1960"
filtered_data = df.filter(filter_query)
print(list(filtered_data))
[
Actor(name=['Laurence', 'Fishburne'], age={'born': '1961'}, gender='male'),
Actor(name=['Keanu', 'Reeves'], age={'born': '1964'}, gender='male'),
Actor(name=['Carrie-Anne', 'Moss'], age={'born': '1967'}, gender='female')
]
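For comparison, here is a minimal plain-Python sketch of what the query "age.born > 1960" expresses. It does not use PyDFQL and says nothing about how the library evaluates queries internally. The birth year is stored as a string, so the sketch converts it explicitly; that PyDFQL performs a comparable conversion is an assumption based on the output shown above. The variable name born_after_1960 is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    name: list
    age: dict
    gender: str


actors = [
    Actor(["Laurence", "Fishburne"], {"born": "1961"}, "male"),
    Actor(["Keanu", "Reeves"], {"born": "1964"}, "male"),
    Actor(["Joe", "Pantoliano"], {"born": "1951"}, "male"),
    Actor(["Carrie-Anne", "Moss"], {"born": "1967"}, "female"),
]

# Plain-Python counterpart of the query "age.born > 1960".
# The year is a string in the data, so it is converted to int before comparing
# (assumption: the query language coerces it in a similar way).
born_after_1960 = [actor for actor in actors if int(actor.age["born"]) > 1960]

for actor in born_after_1960:
    print(" ".join(actor.name))
```

The query string expresses the same selection without the explicit attribute access and type conversion.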
You can also use more complex queries to filter the data. For example, let's filter male actors born after 1960 and before 1965 whose names end with "e":
filter_query = "gender == male and (age.born > 1960 and age.born < 1965) and name matches .*e$"
filtered_data = df.filter(filter_query)
print(list(filtered_data))
[
Actor(name=['Laurence', 'Fishburne'], age={'born': '1961'}, gender='male')
]
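The combined query can likewise be spelled out in plain Python to make each condition explicit. This is only a sketch of the query's meaning, not PyDFQL's implementation. It assumes that "matches" applied to a list field succeeds when any element matches the regular expression, which is consistent with the result above. The helper is_match and the constant ENDS_WITH_E are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass


@dataclass
class Actor:
    name: list
    age: dict
    gender: str


actors = [
    Actor(["Laurence", "Fishburne"], {"born": "1961"}, "male"),
    Actor(["Keanu", "Reeves"], {"born": "1964"}, "male"),
    Actor(["Joe", "Pantoliano"], {"born": "1951"}, "male"),
    Actor(["Carrie-Anne", "Moss"], {"born": "1967"}, "female"),
]

# Same pattern as in the query: any text ending with "e".
ENDS_WITH_E = re.compile(r".*e$")


def is_match(actor):
    """Plain-Python reading of the combined query (illustrative helper)."""
    born = int(actor.age["born"])  # stored as a string in the data
    return (
        actor.gender == "male"                 # gender == male
        and 1960 < born < 1965                 # age.born > 1960 and age.born < 1965
        # Assumption: a list field matches if any of its elements matches.
        and any(ENDS_WITH_E.match(part) for part in actor.name)
    )


selected = [actor for actor in actors if is_match(actor)]
print(selected)
```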
Overall, PyDFQL supports a wide range of features, including:
- Data Sources: Dictionaries, Lists, Objects, SQL Databases
- Comparison Operators: ==, !=, <=, <, >=, >, ~=, ~, &
- Combining Operators: and, or, xor, not
- Membership Operators: in
- Types: Text, Number, Date & Time, Ethernet, IPv4, and IPv6 Addresses
- Slicing: Text, Ethernet, IPv4, and IPv6 Addresses
- Functions: upper, lower, len
For a detailed description of the individual features, check out the User Guide.
PyDFQL can be applied in many contexts due to its flexible design. It is well-suited for working with various data formats and can be easily integrated into your data analysis workflow. Here are some examples where PyDFQL can be particularly useful:
This project wouldn't be possible without these awesome projects:
- Wireshark display filter: Display filter for filtering network packets
- parameterized: Parameterized testing with any Python test framework
- pyparsing: Creating PEG-parsers made easy
- ipranger: Parsing and matching IPv4-addresses
- python-dateutil: Parsing and comparing dates