Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

making the option --limit in pull usable. #165

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
47 changes: 24 additions & 23 deletions lieer/gmailieer.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,10 +59,6 @@ def main (self):
parser_pull.add_argument ('-t', '--list-labels', action='store_true', default = False,
help = 'list all remote labels (pull)')

parser_pull.add_argument ('--limit', type = int, default = None,
help = 'Maximum number of messages to pull (soft limit, GMail may return more), note that this may upset the tally of synchronized messages.')


parser_pull.add_argument ('-d', '--dry-run', action='store_true',
default = False, help = 'do not make any changes')

Expand All @@ -76,9 +72,6 @@ def main (self):
description = 'push',
help = 'push local tag-changes')

parser_push.add_argument ('--limit', type = int, default = None,
help = 'Maximum number of messages to push, note that this may upset the tally of synchronized messages.')

parser_push.add_argument ('-d', '--dry-run', action='store_true',
default = False, help = 'do not make any changes')

Expand Down Expand Up @@ -119,9 +112,6 @@ def main (self):
description = 'sync',
help = 'sync changes (flags have same meaning as for push and pull)')

parser_sync.add_argument ('--limit', type = int, default = None,
help = 'Maximum number of messages to sync, note that this may upset the tally of synchronized messages.')

parser_sync.add_argument ('-d', '--dry-run', action='store_true',
default = False, help = 'do not make any changes')

Expand Down Expand Up @@ -193,7 +183,8 @@ def main (self):
help = 'Remove messages that have been deleted on the remote (default is on)')
parser_set.add_argument ('--no-remove-local-messages', action = 'store_true', default = False,
help = 'Do not remove messages that have been deleted on the remote')

parser_set.add_argument ('--limit', type = int, default = None,
help = 'Maximum number of messages to pull (soft limit, GMail may return more), note that this may upset the tally of synchronized messages.')
parser_set.set_defaults (func = self.set)


Expand Down Expand Up @@ -275,7 +266,6 @@ def setup (self, args, dry_run = False, load = False, block = False):
def sync (self, args):
self.setup (args, args.dry_run, True)
self.force = args.force
self.limit = args.limit
self.list_labels = False

self.remote.get_labels ()
Expand All @@ -293,7 +283,6 @@ def push (self, args, setup = False):
self.setup (args, args.dry_run, True)

self.force = args.force
self.limit = args.limit

self.remote.get_labels ()

Expand All @@ -313,8 +302,8 @@ def push (self, args, setup = False):
query = notmuch.Query (db, qry)

messages = list(query.search_messages ())
if self.limit is not None and len(messages) > self.limit:
messages = messages[:self.limit]
if self.local.config.limit is not None and len(messages) > self.local.config.limit:
messages = messages[:self.local.config.limit]

# get gids and filter out messages outside this repository
messages, gids = self.local.messages_to_gids (messages)
Expand Down Expand Up @@ -344,8 +333,8 @@ def _got_msgs (ms):
actions = [ a for a in actions if a ]

# limit
if self.limit is not None and len(actions) >= self.limit:
actions = actions[:self.limit]
if self.local.config.limit is not None and len(actions) >= self.local.config.limit:
Copy link
Owner

@gauteh gauteh Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>= ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not checked this part of the code actually. This does not seem necessary now anyways, if the number of messages itself is limited.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actions is not the same as number of messages. i think if this PR is going to be considered for merging it needs to be thoroughly tested, over a longer time, and shown to work well. many users rely on lieer to work correctly, and we have to make sure to not break setups or cause data loss. it also needs to be tested with different limits and also with no limit. I do not guarantee that it will be merged in the end.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the >= comment refers to mismatching use of sometimes > and sometimes >=.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I did go through the mismatching operations. I did not change any of them and it is your call as they remain the same as in the original repository. On that note in the push function, I do not think the if loops to limit the messages and actions are necessary now. They can be removed. Should I do that in the next commit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actions is not the same as number of messages. i think if this PR is going to be considered for merging it needs to be thoroughly tested, over a longer time, and shown to work well. many users rely on lieer to work correctly, and we have to make sure to not break setups or cause data loss. it also needs to be tested with different limits and also with no limit. I do not guarantee that it will be merged in the end.

And, yes I will test it for the said scenarios to see if it works well.

actions = actions[:self.local.config.limit]

# push changes
if len(actions) > 0:
Expand Down Expand Up @@ -386,7 +375,6 @@ def pull (self, args, setup = False):

self.list_labels = args.list_labels
self.force = args.force
self.limit = args.limit

self.remote.get_labels () # to make sure label map is initialized

Expand Down Expand Up @@ -422,7 +410,7 @@ def partial_pull (self):

self.bar_update (len(hist))

if self.limit is not None and len(history) >= self.limit:
if self.local.config.limit is not None and len(history) >= self.local.config.limit:
break

except googleapiclient.errors.HttpError as excep:
Expand Down Expand Up @@ -551,6 +539,17 @@ def remove_from_list (lst, m):

changed = True

#limiting the number of messages in the database to local.config.limit parameter
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not have the same order as the list from gmail.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, that is the least we can do to make sure that the oldest messages are discarded to maintain the limit. One way might be to chose a different way to sort other than NEWEST_FIRST.

Copy link
Author

@Lattitude75 Lattitude75 Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, now I see the problem. Maybe, we can go for an oldest thread. That way we only delete a set of messages which did not show up in the front in quite sometime.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if there is a reliable way to make notmuch sort messages the same way as gmail does.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it does come pretty close. The only warning the users should face is that the threads at the very end might be incomplete because of the message limit. But, what we can do is to make sure that threads remain complete and thus remove the old threads as a whole until the sum of the messages is below the limit. What do you prefer for this limit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the recent commit, I decided to keep the threads intact and thus deciding on the number of messages to be deleted based on threads. And, I also added a progress bar to indicate that older messages are being removed.

with notmuch.Database (mode = notmuch.Database.MODE.READ_WRITE) as db:
query = notmuch.Query(db,'')
query.set_sort(notmuch.Query.SORT.NEWEST_FIRST)
msglist = list(query.search_messages())
if len(msglist) > self.local.config.limit:
delete_list = self.local.nm_messages_to_gids(msglist[self.local.config.limit:])
for m in delete_list:
self.local.remove (m,db)
changed = True

if len (labels_changed) > 0:
lchanged = 0
with notmuch.Database (mode = notmuch.Database.MODE.READ_WRITE) as db:
Expand Down Expand Up @@ -588,7 +587,7 @@ def full_pull (self):
# simple metadata like message ids.
message_gids = []
last_id = self.remote.get_current_history_id (self.local.state.last_historyId)

print(self.local.config.limit)
Lattitude75 marked this conversation as resolved.
Show resolved Hide resolved
for mset in self.remote.all_messages ():
(total, gids) = mset

Expand All @@ -598,14 +597,12 @@ def full_pull (self):
for m in gids:
message_gids.append (m['id'])

if self.limit is not None and len(message_gids) >= self.limit:
if self.local.config.limit is not None and len(message_gids) >= self.local.config.limit:
break

self.bar_close ()

if self.local.config.remove_local_messages:
if self.limit and not self.dry_run:
raise ValueError('--limit with "remove_local_messages" will cause lots of messages to be deleted')

# removing files that have been deleted remotely
all_remote = set (message_gids)
Expand Down Expand Up @@ -785,6 +782,9 @@ def set (self, args):
if args.remove_local_messages:
self.local.config.set_remove_local_messages (True)

if args.limit is not None:
Lattitude75 marked this conversation as resolved.
Show resolved Hide resolved
self.local.config.set_limit (args.limit)

if args.no_remove_local_messages:
self.local.config.set_remove_local_messages (False)

Expand All @@ -802,6 +802,7 @@ def set (self, args):
print ("historyId .........: %d" % self.local.state.last_historyId)
print ("lastmod ...........: %d" % self.local.state.lastmod)
print ("Timeout ...........: %f" % self.local.config.timeout)
print ("Limit .............: %d" % self.local.config.limit)
print ("File extension ....: %s" % self.local.config.file_extension)
print ("Remove local messages .....:", self.local.config.remove_local_messages)
print ("Drop non existing labels...:", self.local.config.drop_non_existing_label)
Expand Down
24 changes: 24 additions & 0 deletions lieer/local.py
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,7 @@ def __init__ (self, config_f):
self.ignore_tags = set(self.json.get ('ignore_tags', []))
self.ignore_remote_labels = set(self.json.get ('ignore_remote_labels', Remote.DEFAULT_IGNORE_LABELS))
self.file_extension = self.json.get ('file_extension', '')
self.limit = self.json.get ('limit', None)

def write (self):
self.json = {}
Expand All @@ -115,6 +116,7 @@ def write (self):
self.json['ignore_remote_labels'] = list(self.ignore_remote_labels)
self.json['remove_local_messages'] = self.remove_local_messages
self.json['file_extension'] = self.file_extension
self.json['limit'] = self.limit

if os.path.exists (self.config_f):
shutil.copyfile (self.config_f, self.config_f + '.bak')
Expand Down Expand Up @@ -174,6 +176,10 @@ def set_file_extension (self, t):
print ("Failed creating test file with file extension: " + t + ", not set.")
raise

def set_limit (self,l):
self.limit = l
self.write()


class State:
# last historyid of last synchronized message, anything that has happened
Expand Down Expand Up @@ -413,6 +419,24 @@ def messages_to_gids (self, msgs):

return (messages, gids)

def nm_messages_to_gids (self, msgs):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the existing function, or share code in some way.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, using the existing function might affect the performance if the number of new messages is large. Then, making a new array of messages is completely unnecessary and can add to the processing load. Should I still resort to code directly than to make a new function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have still used the function that I have added, let me know if you want me to include the code directly in gmailieer.py instead.

"""
Gets GIDs from a list of NotmuchMessages, the returned list of tuples may contain
the same NotmuchMessage several times for each matching file. Files outside the
repository are filtered out.
"""
gids = []

for m in msgs:
for fname in m.get_filenames ():
if self.contains (fname):
# get gmail id
gid = self.__filename_to_gid__ (os.path.basename (fname))
if gid:
gids.append (gid)

return gids

def __filename_to_gid__ (self, fname):
ext = ''
if self.config.file_extension:
Expand Down