Commit

orf committed Jul 29, 2023
1 parent faa960c commit 1a15d2c
Showing 2 changed files with 11 additions and 1 deletion.
10 changes: 9 additions & 1 deletion .github/workflows/unique_python_files.yml
@@ -55,9 +55,17 @@ jobs:
  mkdir dataset/
  cat links/dataset.txt | xargs -P 5 -n 4 wget --no-verbose -P dataset/
+ - name: Install parallel
+ run: |
+ sudo apt-get install parallel
  - name: Combine
  run: |
- poetry run pypi-data run-sql ${{ github.workspace }}/sql/unique_python_files.prql unique-python-files.parquet --db=db --output=parquet --threads=2 dataset/*.parquet
+ mkdir combined/
+ find dataset/ -name '*.parquet' | parallel -j 1 --xargs -n2 poetry run pypi-data run-sql ${{ github.workspace }}/sql/unique_python_files.prql --output=parquet --threads=2 combined/{#}.parquet {}
+ - name: List
+ run: ls combined/

- name: Gets latest created release info
id: latest_release_info
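
Note on the `find ... | parallel` line in the Combine step: as I read it, `-n2` hands each job at most two input files and `{#}` expands to the job's sequence number, so the outputs would be `combined/1.parquet`, `combined/2.parquet`, and so on. A minimal sketch of that batching using plain `xargs` and `awk` rather than GNU parallel itself (`batch_pairs` and the file names are hypothetical, for illustration only):

```shell
#!/bin/sh
# Hypothetical helper: reads one path per line on stdin and prints
# "<batch-number> <file> [<file>]" for each batch of at most two files.
# This only imitates the grouping; it does not run pypi-data.
batch_pairs() {
  # xargs with no command defaults to echo, so each output line is one batch;
  # awk then prefixes the line number, standing in for parallel's {#}.
  xargs -n 2 | awk '{ print NR, $0 }'
}

# Stand-in names; the real workflow lists dataset/*.parquet via find.
printf '%s\n' dataset/a.parquet dataset/b.parquet dataset/c.parquet | batch_pairs
# prints:
# 1 dataset/a.parquet dataset/b.parquet
# 2 dataset/c.parquet
```

With three inputs this yields two batches, the second holding the leftover file; the workflow would run one `pypi-data run-sql` invocation per such batch.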
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,6 +6,8 @@ input/*

  .idea/
  data/
+ dataset/
+
  ### Python template
  # Byte-compiled / optimized / DLL files
  __pycache__/