Implement datasets data model #137
Conversation
Force-pushed from adb6771 to bc5aee9
},
  db = database,
) {
  const deleteResult = await disk.delete(dataset.fileKey)
This deletes the file on disk (filesystem, S3, or other clouds). Do you think we should do it another way, or is blocking the request fine?
Fine for now, although it would be more scalable to delete the file in a job afterwards.
Let me try it in production with S3 enabled, and we can move it to a job later.
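Moving the delete into a job could be sketched roughly like this (illustrative names, not the PR's actual code): the request handler only enqueues the file key, and a background worker performs the storage call, retrying on failure. `Disk` here stands in for the real `DiskWrapper`.

```typescript
// Sketch: defer storage deletion to a background worker instead of
// blocking the request on disk/S3 I/O. All names are illustrative.
type Disk = { delete(fileKey: string): Promise<void> }

class FileDeletionQueue {
  private pending: string[] = []
  constructor(private disk: Disk) {}

  // Called from the request handler: O(1), never touches storage.
  enqueue(fileKey: string): void {
    this.pending.push(fileKey)
  }

  // Called by a background worker (cron tick, job runner, ...).
  // Returns how many files were actually deleted this pass.
  async drain(): Promise<number> {
    let deleted = 0
    while (this.pending.length > 0) {
      const key = this.pending.shift()!
      try {
        await this.disk.delete(key)
        deleted++
      } catch {
        this.pending.push(key) // keep the key for the next drain
        break
      }
    }
    return deleted
  }
}
```

In production the in-memory array would be a real job queue, so the keys survive restarts; the request/worker split stays the same.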
Force-pushed from 8c0f56c to ee4887c
  description,
  submitStr,
  model,
}: Props<TServerAction>) {
Improve generic for destroy modal
  'application/vnd.ms-excel',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  'application/vnd.oasis.opendocument.spreadsheet',
]
const MAX_SIZE = 3
3MB only? That's very small; I'd push it to 25 or so.
25MB of text?
Well, the theory is that they want to evaluate on a large body of data, no? 3MB doesn't push our infra hard at all, and it can be limiting for no reason.
25 then. But will the Node server handle it?
We'll have to set this: https://nextjs.org/docs/app/api-reference/next-config-js/serverActions#bodysizelimit
Although the best approach would be to upload directly to S3 and then process the file when S3 responds. But that would require changing how the upload is done.
I'll start with 15MB and see if someone complains
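For reference, the Next.js option linked above would look roughly like this in next.config.js, with the limit matching the 15MB chosen here (shape as documented for the App Router at the time; treat the exact key path as version-dependent):

```javascript
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    serverActions: {
      // Raise the server-action request body limit from the 1mb default
      // so multipart uploads of this size are accepted.
      bodySizeLimit: '15mb',
    },
  },
}

module.exports = nextConfig
```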
return {
  data,
  mutate,
  createFormAction,
You shouldn't use this, as it won't be compatible with the JSON-based server action.
I need this to submit a multipart form with an attachment. This is already working.
True. Edge case.
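The constraint being discussed can be seen in a tiny sketch (illustrative names; `FormData`/`Blob` as available in modern Node and browsers): a JSON body has no way to carry a raw file, so the action submitting an attachment must accept a multipart form.

```typescript
// Illustrative: build the multipart body that a JSON-based action
// cannot express. Field names here are made up for the example.
function buildDatasetForm(name: string, csvContent: string): FormData {
  const form = new FormData()
  form.append('name', name)
  // The attachment travels as a multipart part; JSON has no equivalent.
  form.append('dataset_file', new Blob([csvContent], { type: 'text/csv' }), 'data.csv')
  return form
}
```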
@@ -22,15 +24,47 @@ export const createDataset = async (
}
disk: DiskWrapper
This is called disk, but it can be other file storage, no? Like S3 or others.
minor comments 👌🏼
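On the naming question: "disk" reads as a storage-agnostic abstraction, with the concrete backend (local filesystem, S3, ...) chosen at configuration time. A minimal sketch of the assumed interface (not the PR's actual `DiskWrapper`), plus an in-memory implementation handy for tests:

```typescript
// Assumed shape of the storage abstraction; the name "disk" covers
// any backend, not just the local filesystem.
interface DiskWrapper {
  put(fileKey: string, contents: string): Promise<void>
  get(fileKey: string): Promise<string>
  delete(fileKey: string): Promise<void>
}

// In-memory backend: same contract, no real storage involved.
class MemoryDisk implements DiskWrapper {
  private files = new Map<string, string>()

  async put(key: string, contents: string): Promise<void> {
    this.files.set(key, contents)
  }

  async get(key: string): Promise<string> {
    const v = this.files.get(key)
    if (v === undefined) throw new Error(`not found: ${key}`)
    return v
  }

  async delete(key: string): Promise<void> {
    this.files.delete(key)
  }
}
```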
Force-pushed from ee4887c to c026196
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "datasets_workspace_idx" ON "latitude"."datasets" USING btree ("workspace_id");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "datasets_author_idx" ON "latitude"."datasets" USING btree ("author_id");--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "datasets_workspace_id_name_index" ON "latitude"."datasets" USING btree ("workspace_id","name");
☝️ Added 3 indexes:
- Workspace
- Author
- Workspace + Name

I'll add UI validation for workspace + name uniqueness. This was a recommendation from @csansoon. He said it's nice not to repeat dataset names, and I agree.
What?
☝️ Table for storing data related to uploaded datasets.

What?
Indexes on workspace_id and author_id. A unique index on datasets.name in combination with workspace_id.

The delimiter is detected from [',', '\t', ' ', ';']. We need to know this in order to read the CSV. Nothing is easy 😂

Next TODO
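The delimiter detection mentioned above could be sketched like this (illustrative, not the PR's actual implementation): count each candidate delimiter in the header line and pick the most frequent one.

```typescript
// Candidate delimiters, matching the list in the PR description.
const CANDIDATES = [',', '\t', ' ', ';'] as const

// Guess the CSV delimiter from the header line: the candidate that
// appears most often is assumed to be the column separator.
function detectDelimiter(csv: string): string {
  const header = csv.split('\n', 1)[0]
  let best = ','
  let bestCount = 0
  for (const d of CANDIDATES) {
    const count = header.split(d).length - 1
    if (count > bestCount) {
      best = d
      bestCount = count
    }
  }
  return best
}
```

This header-only heuristic breaks if a quoted header cell contains a candidate character, which is why real CSV sniffers look at several rows; it is only a sketch of the idea.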