IBX-8562: Command to remove duplicated entries after faulty IBX-5388 fix #410
I think it is a bad idea not to merge and release the script in the product. Running this command would be a one-off, so we can make the command hidden: https://symfony.com/doc/current/console/hide_commands.html That will prevent the command from being displayed by |
I didn't know about the It would greatly improve developer experience for people executing the update (no manual steps needed) without polluting the |
The problem with that is that it doesn't scale properly. Command might be hidden, true, but if we wanted for every upgrade to have such command, the DI container would get inflated with commands. We already have a lot of them. We need a streamlined solution that is able to execute such updates via e.g. |
You mention the DI container getting inflated with Commands as the main argument against it, but what is the real cost of it? The container taking milliseconds longer to build? Also, we're not discussing every future update here, only this specific case where @Nattfarinn decided that it's better to use a Command than a SQL script. For me the |
I'm talking about the scale. You cannot say that this is only for this case, because it sets a precedent. Or rather, in this case, it continues an anti-pattern we've used before. We already have a lot of commands that could be wrapped into one streamlined solution. However, if you really want it in the product, it should follow patterns existing in the product:
This adds a significant amount of work, which could be better spent on a generic upgrade solution. But yes, it is still much less work to fix this individually than to work on a generic one. For the Official Documentation the solution was fine due to its compactness. So, to sum up, I'd be fine with this command being hidden and in the product, but it still needs to have layer separation in that case. |
Theoretical SQL fix for the issue:
PostgreSQL:
DELETE FROM ezcontentobject_attribute
WHERE ezcontentobject_attribute.id IN (
    SELECT ea_duplicate.id FROM ezcontentobject_attribute AS ea_duplicate
    JOIN ezcontentobject_attribute AS ea ON
        ea.version = ea_duplicate.version
        AND ea.language_id = ea_duplicate.language_id
        AND ea.contentclassattribute_id = ea_duplicate.contentclassattribute_id
        AND ea.contentobject_id = ea_duplicate.contentobject_id
    WHERE ea.id < ea_duplicate.id
)
MySQL:
DELETE ea_duplicate FROM ezcontentobject_attribute AS ea_duplicate
JOIN ezcontentobject_attribute AS ea ON
    ea.version = ea_duplicate.version
    AND ea.language_id = ea_duplicate.language_id
    AND ea.contentclassattribute_id = ea_duplicate.contentclassattribute_id
    AND ea.contentobject_id = ea_duplicate.contentobject_id
WHERE ea.id < ea_duplicate.id
(Note: for databases it does not matter if condition is part of JOIN or WHERE)
Providing this in case it's valuable, and so it does not get lost in private messages. Yes, I've tested this manually on databases, and SQL is correct (with their respective language differences). It should give the same result as the above command, but I haven't checked as I don't currently have an instance of Ibexa DXP with that particular issue. |
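For readers who want to check what the self-join above keeps and removes, here is a minimal sketch that runs the PostgreSQL-style statement against an in-memory SQLite database. The table is a hypothetical stand-in with only the five columns the statement touches, not the real Ibexa schema, and SQLite is used purely for convenience; the expectation is that within each group of rows sharing the same four key columns only the lowest `id` survives.

```python
import sqlite3

# Hypothetical, reduced stand-in for ezcontentobject_attribute:
# only the columns referenced by the DELETE statement are modelled.
SCHEMA = """
CREATE TABLE ezcontentobject_attribute (
    id INTEGER PRIMARY KEY,
    contentobject_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    language_id INTEGER NOT NULL,
    contentclassattribute_id INTEGER NOT NULL
)
"""

# Same shape as the PostgreSQL variant quoted in the thread.
DELETE_DUPLICATES = """
DELETE FROM ezcontentobject_attribute
WHERE id IN (
    SELECT ea_duplicate.id
    FROM ezcontentobject_attribute AS ea_duplicate
    JOIN ezcontentobject_attribute AS ea ON
        ea.version = ea_duplicate.version
        AND ea.language_id = ea_duplicate.language_id
        AND ea.contentclassattribute_id = ea_duplicate.contentclassattribute_id
        AND ea.contentobject_id = ea_duplicate.contentobject_id
    WHERE ea.id < ea_duplicate.id
)
"""


def remove_duplicates(rows):
    """Load rows into a fresh in-memory table, run the DELETE, return surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO ezcontentobject_attribute VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.execute(DELETE_DUPLICATES)
    surviving = [
        row[0]
        for row in conn.execute("SELECT id FROM ezcontentobject_attribute ORDER BY id")
    ]
    conn.close()
    return surviving


# Columns: id, contentobject_id, version, language_id, contentclassattribute_id
ROWS = [
    (1, 10, 1, 2, 100),
    (2, 10, 1, 2, 100),  # duplicate of id 1
    (3, 10, 1, 2, 100),  # duplicate of id 1
    (4, 10, 2, 2, 100),  # different version, kept
    (5, 10, 1, 4, 100),  # different language, kept
    (6, 11, 1, 2, 100),  # different content object, kept
]

print(remove_duplicates(ROWS))  # [1, 4, 5, 6]
```

This only illustrates the logic on toy data; it says nothing about behaviour or timing on a production MySQL or PostgreSQL instance.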
Well, in that case we already have a precedent, as this is what we have done in the past. For instance: ibexa/fieldtype-richtext#77 And yes, there we have the gateways, tests etc. that you mentioned are needed |
@adamwojs we need to make a decision here, either:
- we decide to go with the in-product hidden command and refactor it to have separate gateway layer at least and integration coverage
- we go with a regular upgrade script, assuming the SQL provided a few comments above does the job
- we put this after final review in doc only (there are IMO valid objections in the comments above).
ad.2. Update: In my tests with 10, 100, 1000 duplicates there is no problematic difference in time between command and direct SQL. With 10k duplicates the command takes seconds and direct SQL several minutes. With 100k duplicates I started the direct SQL in the afternoon and interrupted execution in the morning. With the command 100k is still a matter of seconds. (In theory this could be related to individual DB config. But then we would have to assume partners would cover tweaking that.) |
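The thread does not establish why the direct SQL slows down so sharply, and the command's actual implementation is not shown here. As a rough sketch of one alternative that avoids the large self-join, the snippet below first finds duplicate groups with a single `GROUP BY ... HAVING COUNT(*) > 1` pass, collects every id above the lowest one in each group, and deletes those ids in fixed-size batches. The function names, the `batch_size` parameter and the reduced table are assumptions made for illustration, and SQLite stands in for the real database.

```python
import sqlite3

TABLE = "ezcontentobject_attribute"
KEY_COLUMNS = ("contentobject_id", "version", "language_id", "contentclassattribute_id")


def make_table(rows):
    """Create an in-memory table holding only the key columns (hypothetical subset)."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{c} INTEGER NOT NULL" for c in KEY_COLUMNS)
    conn.execute(f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, {columns})")
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def remove_duplicates_in_batches(conn, batch_size=500):
    """Keep the lowest id of each duplicate group; return the number of deleted rows."""
    key_list = ", ".join(KEY_COLUMNS)
    key_match = " AND ".join(f"{c} = ?" for c in KEY_COLUMNS)

    # One aggregate pass: the id to keep for every group that has duplicates.
    groups = conn.execute(
        f"SELECT MIN(id), {key_list} FROM {TABLE} "
        f"GROUP BY {key_list} HAVING COUNT(*) > 1"
    ).fetchall()

    # Collect the ids to delete: everything in the group above the kept id.
    doomed = []
    for keep_id, *key in groups:
        found = conn.execute(
            f"SELECT id FROM {TABLE} WHERE {key_match} AND id > ?", (*key, keep_id)
        )
        doomed.extend(row[0] for row in found)

    # Delete in batches so each statement stays small.
    deleted = 0
    for start in range(0, len(doomed), batch_size):
        batch = doomed[start:start + batch_size]
        marks = ", ".join("?" * len(batch))
        cursor = conn.execute(f"DELETE FROM {TABLE} WHERE id IN ({marks})", batch)
        deleted += cursor.rowcount
    conn.commit()
    return deleted


# Columns: id, contentobject_id, version, language_id, contentclassattribute_id
conn = make_table([
    (1, 10, 1, 2, 100),
    (2, 10, 1, 2, 100),  # duplicate of id 1
    (3, 10, 1, 2, 100),  # duplicate of id 1
    (4, 10, 2, 2, 100),  # unique
])
print(remove_duplicates_in_batches(conn, batch_size=1))  # 2
```

On a real installation the per-group lookup would only be reasonable with an index covering the four key columns, which is another assumption of this sketch.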
Summing up the last meeting. Got the following error on a big DB from a live project (14M records):
with stack trace:
|
We have several customers waiting for this fix. Could we get some progress on it? |
bdeae0f to 46e0e4a
The issue found on provided database has been resolved.
I have not found other issues.
ea5186c to b552a54
CI green before removing temp commit. ✅ |
Quality Gate passed |
Do not merge: TEMP commit inside (ref. 46e0e4a) (commit removed)
This PR is not meant to be merged. Code will be moved to documentation once review and QA is done.
Branch base is set to 98b7b50 commit to make QA life a bit easier.
v3.3
Command removes duplicated database entries after faulty IBX-5388 fix (base commit for this branch).
Usage:
Options:
Code is compatible with PHP 7.3+. There should be no need to provide a separate command for higher DXP versions (4.6 and above) with code adjusted to newer PHP versions and coding standards.