Replies: 15 comments 20 replies
-
Storing data might be problematic; you might need to comply with GDPR.
Cheers,
Michael
-
On Wed, Jun 1, 2022 at 11:50 AM Christian Muise wrote:
> Compliance is easy, no? Delete on demand, don't retain if requested via the API, etc.
Cheers,
Michael
-
How can you prevent it? Such data can be part of the initial state representation, for instance. You have no control over this, and that's the real problem.
On Wed, Jun 1, 2022 at 11:57 AM Christian Muise wrote:
> Also, there's no plan to store "personal data" -- violation would only come if people start embedding such data in PDDL comments or some such. No IPs, emails, etc.
Cheers,
Michael
-
Assuming it complies with legislation and, as you mention, we provide a clear explanation about the purpose of the data gathered and allow for opt-out, then it seems it can build a good dataset for interesting analysis in the future. Information about the performance of the planners, their logs, and timestamps is quite useful and doesn't contain any sensitive information. Storing PDDLs can be sensitive, but I don't think we are encrypting the communication either, so storage is not the only weak point. Being an open-source project, it can be deployed on local servers in case the PDDL should be kept private.

I'm no expert on the legal & cyber side, so I focus mostly on information that can be useful for teaching and research. Besides planner performance metrics, having PDDLs with their timestamps and an associated run ID to relate to the planner's performance is really useful. I can see how it can be the basis for building automatic feedback mechanisms and understanding common errors while modeling.

Does our current implementation store all the data? As soon as the flower instance is restarted, the related DB is wiped. Where are we storing any info now? This is regarding your comment that the re-implementation of the solver stores everything.
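For concreteness, a retained record along these lines might look roughly like the following; this is a hypothetical sketch only, with illustrative field names rather than the project's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SolveRecord:
    """One planner call, as it might be retained for teaching/research analysis."""
    run_id: str                  # ties planner performance back to the submitted model
    submitted_at: datetime       # timestamp of the call
    planner: str                 # which solver/endpoint was invoked
    solve_time_s: float          # wall-clock time to solve
    solved: bool                 # whether a plan was found
    log: str                     # planner output / log
    domain_pddl: Optional[str]   # only retained in "store input" modes
    problem_pddl: Optional[str]  # only retained in "store input" modes
```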
-
I would say "no".
On Fri, Jun 3, 2022 at 11:33 AM Christian Muise wrote:
> Precisely the type of data that would be a goldmine for KEPS-like questions. Shouldn't be forced, but it could really open the door to model modelling.
>
> > Does our current implementation store all the data? As soon as the flower instance is restarted, the related DB is wiped. Where are we storing any info now? This is regarding your comment that the re-implementation of the solver stores everything.
>
> Apparently, I was mistaken! Because of the flower views, I'd assumed it was all stored in a persistent DB. If this isn't the case, then this discussion has now morphed to "do we want to change the retention strategy to anything persistent?" ;)
Cheers,
Michael
-
Before discussing how to comply with GDPR, I claim that we should make every effort to not need to comply with it. The reason is that complying is not the problematic part; it's the bureaucracy around it, as I mentioned before. We would need to have people with particular roles, go through certain training, etc.

Here is an explanation of who needs to comply:
https://www.termsfeed.com/blog/need-comply-gdpr/#:~:text=The%20GDPR%20states%20that%20any,be%20compliant%20with%20the%20GDPR

So, if you allow someone to submit information in a PDDL like *(and (name Malte) (last Helmert))* and you store it, you need to comply with GDPR.

If you would like to be able to store PDDLs, there should be other solutions. There can be multiple options:
1. PDDLs stored elsewhere
2. PDDLs are kept but every string is anonymized. Here, the question is whether we want to be able to restore the original.
3. ...
On Mon, Jun 6, 2022 at 9:17 AM Christian Muise wrote:
> I imagine industry for proprietary purposes will go with mode 4 -- on-prem deployment. It's why IBM hosts their own version of GitHub, rather than use the primary version. Would indeed be interested in @ctpelok77's take, since he's dealt with more IBM lawyers than I have ;).
>
> Some very interesting ideas raised, @miquelramirez. As a form of exchange, opening up the compute more or less is an interesting model. "Your data is the product" doesn't need to be a bad thing -- just as long as it's explicit and agreed upon. So low resource on 3 makes sense, medium on 1, and a broader pool of compute for dedicated contribution to the data front in the case of option 2.
>
> I'd like to make it clear that the data isn't being collected for our private purposes, but rather to be packaged up and reflected back to the research community to explore. I don't think we'd get very far trying to sell personalized PDDL data to advertisers ;) (can you imagine the ads for toy blocks and robotic grippers? fun!), but either way we should make this clear.
>
> Since it was brought up above, GDPR would dictate the functionality to remove data from the system. So everything stored should be tied to the unique hash built for the result, and we can assume access to that constitutes ownership -- removal would be a matter of the right DB query to wipe everything associated with that hash, and we can even implement mode 3 using this functionality.
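A minimal sketch of what that hash-keyed removal could look like, assuming a relational store; the table and column names below are purely illustrative, not the project's actual schema:

```python
import sqlite3

def wipe_by_hash(db_path: str, result_hash: str) -> None:
    """Delete everything tied to one submission, keyed by the unique result hash."""
    with sqlite3.connect(db_path) as conn:
        # Hypothetical tables: one for the submitted payloads, one for solver stats.
        conn.execute("DELETE FROM payloads WHERE result_hash = ?", (result_hash,))
        conn.execute("DELETE FROM results WHERE result_hash = ?", (result_hash,))
```

Putting this behind an endpoint that requires the result hash matches the "access constitutes ownership" assumption, and calling it right after a result is retrieved would effectively implement mode 3.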
Cheers,
Michael
-
@nirlipo: I am almost certain that an industry partner would opt out of storing, even just in case.
Cheers,
Michael
-
Hi folks! On meta-data: the trend is to see meta-data as both useful and concerning. The concern is that it might enable identifying people or practices of organizations, revealing details of their operations. So: flexibility and separation of concerns in the software artifacts, so it's very clear what's happening.
-
Okay. FWIW, I reviewed very recently some papers from people worried about someone reverse engineering statistics (like a NN) to identify all kinds of sensitive information (like whether somebody's face is in the dataset). So I have read recently quite a few papers, both theoretical and applied, that demonstrate the (in general) hopelessness of doing so without some very special information.

For those interested, check out this <https://link.springer.com/chapter/10.1007/11681878_14> and the paper "Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures", two didactical, readable examples taken from the literature on the problem posed by the necessity of coming up with privacy-preserving statistical databases.

To further ground the discussion, check out the MATILDA project <https://acems.org.au/news/matilda-stress-tests-algorithms> at the University of Melbourne. And let's look at a real-world example of what the collected metadata could look like, for instance something like this: <https://acems.org.au/sites/default/files/conv-instance-space.jpg>

So from meta-data like that, is the one below a reasonable scenario?

> Suppose competitor B is aware of the general business model of A, where A uses planning -- thanks to a big PR campaign on using AI -- and B has information on the context of A. B could notice how A is solving far more planning problems when there are supply chain issues with some components A uses, and that such solving more problems is related with a deterioration of the time it takes A to solve issues with their clients.

I cannot honestly see how this is possible with modes of operation 1, 3, or 4. Mode 2 does leave an opening, and this example you gave suggests the importance of anonymizing (by scrambling) stuff like names of objects, predicates, actions, etc. In other words, every stored PDDL should go through a filter that maps every NAME token in the PDDL lexer to some random string of characters.

Miquel.
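A minimal sketch of such a scrambling filter is below; it is purely illustrative, assumes a simplified tokenization rather than a real PDDL lexer, and its keyword list is not exhaustive:

```python
import re
import secrets

# Plain-word PDDL keywords that must survive scrambling (illustrative, not exhaustive).
KEYWORDS = {"define", "domain", "problem", "and", "or", "not", "when",
            "forall", "exists", "either", "object", "number"}

def scramble_pddl(text: str) -> str:
    """Replace every NAME token with a random identifier, consistently across the file."""
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        tok = match.group(0)
        if tok.startswith(":") or tok.lower() in KEYWORDS:
            return tok  # keep structural keywords like :action, :precondition, and, not, ...
        prefix, name = ("?", tok[1:]) if tok.startswith("?") else ("", tok)
        if name.lower() not in mapping:
            mapping[name.lower()] = "n" + secrets.token_hex(4)
        return prefix + mapping[name.lower()]

    # NAME tokens per a simplified PDDL lexer: identifiers, optionally ':'- or '?'-prefixed.
    return re.sub(r"[:?]?[A-Za-z][A-Za-z0-9_-]*", replace, text)
```

Whether the `mapping` is kept somewhere (so the original can be restored, per option 2 above) or thrown away is exactly the policy question Michael raised.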
-
@miquelramirez: this is no joke. If someone wants to harm you or your organization, they can submit a call with the personal information of EU citizens in mode 2 (I hope I remembered the mode correctly) and report you for a GDPR violation.

Regardless, the cases where you don't need to comply with GDPR when storing information that might include personal information of EU citizens are scarce. Obviously, a valid PDDL could be made to include such information. Hence, if you allow storing it as is, you need to comply with GDPR.

On the topic of meta-data, there are strict rules in organizations such as mine on how long you can retain data and information derived from that data, who the data owner is, etc. I do not expect people will store data or derived info (meta-data) on uncontrolled servers.
On Tue, Jun 7, 2022 at 8:07 AM Hector Palacios wrote:
> Suppose company A uses planning for coordinating operations, and metadata is exposed on the instances they solve and how long it took to solve them. Suppose competitor B is aware of the general business model of A, where A uses planning -- thanks to a big PR campaign on using AI -- and B has information on the context of A. B could notice how A is solving far more planning problems when there are supply chain issues with some components A uses, and that solving more problems is related with a deterioration of the time it takes A to solve issues with their clients.
>
> So, B could attack A by disrupting the supply chain further, by attempting to buy some of those components while launching a campaign saying they are faster at providing similar services.
>
> Examples of A and B could be telco companies, and the components could be as simple as Ethernet endings for cable. If A were optimizing inventory but was sensitive to that, they could be the subject of an attack.
>
> (This example is inspired by an article in the Economist on how highly optimized operations suffered most due to Covid-related disruption of the supply chain, and how countries got protectionist.)
>
> In the training I receive about confidential information, they emphasize that I shall never reveal any issue we might have providing services. Not that I would be aware of that kind of information.
>
> Makes sense?
Cheers,
Michael
-
@miquelramirez: yes, scrambling names would work wrt GDPR.
Cheers,
Michael
-
Such a great discussion -- thank you all! There is *some* talking past one another, but not much. What's clear is that any one mode will be problematic, and very restrictive modes will be necessary in order for some to use it.

> I do not expect people will store data or derived info (meta-data) on uncontrolled servers.

This statement is either categorically false or true depending on your quantifier for "people". Me? Store and publicize every iteration of PDDL that I write along the journey to a finished PDDL. It's fascinating data, and I'm willing to share it with the world. I suspect others would be as well. Company employees? Not a chance in hell, outside of public tutorial stuff.

To recap, these are the 4 proposed modes of operation (1-3 thanks to @miquelramirez):

1. only metadata about the solving process is retained
2. metadata and input are retained
3. nothing is retained, other than what may be temporarily needed for the operation of the server, security, etc.
4. public server is not used, and an on-prem deploy is used instead (viable since it's FOSS)

Having done my time at IBM, I can almost certainly say that 4 (the most conservative mode) would be the choice within companies. It means your data doesn't even fly out of company networks. Any security expert worth a damn would likely find boat-loads of issues with the public server setup, and recommend against using it for company business.

I don't think the threat of someone putting details they shouldn't in their PDDL should thwart mode 2 entirely. We need to be clear what they're giving us if that's what they sign up for, and make it clear how to take the data down (along with having an easy way to do this), but never storing information for fear of what users might include seems a bit overkill. Could we be DDoS'd logistically? Yes, and in a variety of ways. If it becomes an issue, we wipe the DB of all mode-2 data from a problematic time-span and move to a token-only model (you only get mode 2 if you've been trusted and given a token to share -- e.g., an instructor of a class or PI of a lab or researcher or whatever).

The biggest outstanding issue is what might go into the ToS for what qualifies as "metadata". I'd like to err on the side of caution and not have things that can provide identifiable info, but I don't think we need to go so far as to worry about the competitive company scenario above -- it's unrealistic, since any company with real concerns like that *really should not be using the public service*. Doing so is at their own risk, and mode 4 is the obvious choice. Even from an efficiency standpoint, it makes no sense to base your operations on a server with limited resources, shared among several multi-hundred-student classes doing PDDL assignments ;).

Any major objections to the reasoning above? I know there are some decisions in there that will lead to "well, company X just isn't going to use the service", and ultimately I think that's fine/expected for modes 1-3. Company X really shouldn't be using the current server either, since it's controlled by some unknown entity (me) and isn't battle-hardened to protect their PDDL trade secrets.
-
@Christian Muise: "people" == "people in my org", since the question was about that.

@Christian Muise: I think you are missing the point of GDPR. The users that give you the data might not even own it. It does not absolve you of the responsibility of not storing personal data. Hence the need to scramble all labels.

Can you specify in a deployment which modes are permitted for that deployment? Say I want a deployment in which only mode 3 is allowed.
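On that question of restricting modes per deployment: one way to express the policy is a deployment-level setting, along the lines of the sketch below (the environment variable names and helper are hypothetical, not part of the project's actual configuration):

```python
import os
from typing import Optional

# Hypothetical retention modes, mirroring the recap above:
#   1 = metadata only, 2 = metadata + input, 3 = nothing persistent.
ALLOWED_MODES = {
    int(m) for m in os.environ.get("PAAS_ALLOWED_RETENTION_MODES", "1,2,3").split(",")
}
DEFAULT_MODE = int(os.environ.get("PAAS_DEFAULT_RETENTION_MODE", "3"))

def resolve_mode(requested: Optional[int]) -> int:
    """Pick the retention mode for a request, honouring the deployment's policy."""
    mode = requested if requested is not None else DEFAULT_MODE
    if mode not in ALLOWED_MODES:
        raise ValueError(f"retention mode {mode} is not permitted on this deployment")
    return mode
```

A deployment that should only ever run mode 3 would then set `PAAS_ALLOWED_RETENTION_MODES=3`.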
Cheers,
Michael
-
That's not a good example (let's put aside the difference between copyright and personal data protection): GitHub has GDPR officers who went through proper training. Do you want to do that?
On Tue, Jun 7, 2022 at 10:41 AM Christian Muise wrote:
> Label scrambling can't work on non-parseable PDDL -- and there will be some. If someone submits the full copyrighted works of Game of Thrones as a comment to this thread, then GitHub is not automagically at fault. The user violated the copyright, a takedown notice is issued, and the offending data needs to be scrubbed. If we have that process in place, then it adheres to the law, no?
>
> > Can you specify in a deployment which modes are permitted for that deployment? Say I want a deployment in which only mode 3 is allowed.
>
> I would assume if mode 4 is taken, then whoever deploys can set whatever default they want (mode 3 or otherwise). There's nothing proprietary about the server setup. I reckon an on-prem deploy would be modified further so that it could scale/integrate with their infrastructure. If there's some in-house need for metadata, then mode 4+1. If it's meant to just be a remote endpoint generating plans all day, then mode 4+3. But, ultimately, the mode of operation (defaults, availability, etc.) is up to whoever deploys the server. On-prem means it's the company's own employee.
Cheers,
Michael
-
These are all very good questions, which I don't have an answer to. It is
my understanding that with GDPR, it's better to be safe than sorry.
On Thu, Jul 7, 2022, 11:07 PM Christian Muise wrote:
> Obviously not ;). So if I create something and host it on Heroku, who's responsible for the GDPR protocols? What if I openly invite people to submit their models to a repo on GitHub? What if they email models to me directly, and I host a zip of them all on my own website?
>
> What I'm trying to peel apart is the subtlety that delineates a reasonable need to adhere to GDPR full-on, versus those cases where it's just not an issue.
-
What are everyone's thoughts on this?
The legacy solver stores just the time it's taken to solve the problem sent, and no further details. This re-implementation stores everything, largely because that was the simplest model under the new framework. My current thought (but not entirely attached to it):

- Retain the PDDL / full call payload by default, reject anything over a certain size.
- Have an endpoint to wipe the data on a specific call (payload details (like PDDL) removed, but stats like solve time and endpoint retained); see the sketch at the end of this post.
- Have an optional flag for all services that changes the retention strategy to "delete after a day" or "delete after retrieved" or whatever.
- Clearly stipulate on the landing page what is happening with the data, and why it's being collected.

My thinking on the above stems from a couple of things... (1) it's an open project that anyone can clone / deploy on their own (and we should make this as turn-key as possible), which means data is entirely controlled by them; and (2) it's a service providing a transaction of data for compute. The data retained is a contribution to the planning community -- to be released publicly (no IPs, but no scrubbing of PDDL) for analysis in KEPS-like studies -- and the free (as in $$) service is the exchange in return. I'm imagining studies on how a domain goes from blank PDDL to complete working copy, or cross-section analysis of common errors in a class, or whatever.
Tagging some I know may want to contribute, having worked on some version of the solver or taught courses that may use it (please feel free to add anyone else you might think may be interested): @nirlipo @jan-dolejsi @FlorianPommerening @miquelramirez @ctpelok77
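As referenced in the bullet list above, here is a minimal sketch of what the wipe endpoint could look like; it is a Flask-style illustration with hypothetical route, table, and column names, not the project's actual API:

```python
from flask import Flask, jsonify
import sqlite3

app = Flask(__name__)
DB_PATH = "submissions.db"  # hypothetical store

@app.route("/wipe/<result_hash>", methods=["DELETE"])
def wipe(result_hash: str):
    """Remove the stored payload (e.g., PDDL) for one call, keeping solve-time stats."""
    with sqlite3.connect(DB_PATH) as conn:
        # Illustrative schema: null out payload columns, leave stats and endpoint intact.
        conn.execute(
            "UPDATE submissions SET domain_pddl = NULL, problem_pddl = NULL "
            "WHERE result_hash = ?",
            (result_hash,),
        )
    return jsonify({"wiped": result_hash})
```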