-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some performance related changes #14
base: master
Are you sure you want to change the base?
Conversation
Performance related: 1. Caching of Stringbuilder on RecordOperations.cs for "RecordToString" and "RecordValuesToString" methods to lessen a little the burden on the GC (I actually didn't measure anything...) 2. Change to use "CompareOptions.OrdinalIgnoreCase" and "CompareOptions.Ordinal" whenever possible (prety good performance gain whenever we have some heavy string working to do...) 3. Change to use "ToLowerInvariant" instead of "ToLower" for a small performance gain (10%) Access Visibility 1. some access visibility level changes to useful Field properties on "FieldBase", "FixedLengthField" and "DelimitedField" (similar to the previous change on atributes!) Other changes 1. FileHelperAsyncEngine.EOF, it is userful if using this engine using ReadNextRecord () or ReadNexts () (usage in a while loop for example) 2. FileHelperAsyncEngine.Close, add a dispose to the TextWriter just as a precaution...
@@ -874,7 +874,7 @@ public BooleanConverter(string trueStr, string falseStr) | |||
public override object StringToField(string from) | |||
{ | |||
object val; | |||
string testTo = from.ToLower(); | |||
string testTo = from.ToLowerInvariant (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember to get some problems with Invariant, maybe that don't convert Á É etc to lower, maybe we must use EqualsIgnoreCase instead of use ToLower
Excellent contribution, from a fast review more than of the 80% can be applied, the Invariant methods have some trubble so better to merge them in another wave, let me try to margue only some changes Thanks !!!! |
Hi! Glad to hear that you accepted some of the suggestions... I should have
Anyway, the places where the ToLowerInvariant is used should not have this
I suggested it, because I kind of used the fields information generated by Regards, On Wed, Sep 11, 2013 at 3:12 PM, Marcos Meli [email protected]:
|
Great !! Is all fine to send it all, but I need to review later at home. But small batches are better for automerge them :)
engine.Options.Fields.Last().Optional = true; or some declarative changes for example engine.Options.Fields[i].Width = 20; etc We must create some scenarios of run time options and go in that directions. I will like to discuss that !! Thanks |
Maybe to avoid using ClassBuilder in partial trust environment, we could: Simple approach
More complex approach (involves upgrading to .Net 3.5 or maintaining
On Wed, Sep 11, 2013 at 3:28 PM, Marcos Meli [email protected]:
|
Actually the run time creation of a layout without using the classbuilder We just have to suppress the object generation and use the "engine. About the public access of the Field info: I didn't see the Fields internal Regards On Wed, Sep 11, 2013 at 3:45 PM, Khalid Salomão [email protected]:
|
Great feedback About dinamic generation:
I think that we must to try to reinvent FileHelpers, make it all backward compatible that we can, decoupling some things. Structure Definition: Values Assigment: The best option to read generic CSV with FileHelpers (the current way simply dont work) would be to iterate through records and through fields in a enumerable way or with indexers (by index, by fields name) etc foreach(var record in CommonEngine.ReadCsv(..)) What do u think ? |
I like this kind of usage: I actually like the 3.0 way of accessing the "engine.LastRecordValues" to engine.ReadNext (); If you are open to some degree of redesign in the 3.0 release, I think it I would suggest only a small redesign and simplification to allow a fast What do you think of doing a 3.0 release that is somewhat compatible with So I would suggest for the 3.0: . The bunch of small changes and fixes that is already in the todays master Obs.: I tried to access the link For the 4.0, I would consider:
I mainly use the FileHelperAsyncEngine, since I am usually dealing with
On Wed, Sep 11, 2013 at 4:07 PM, Marcos Meli [email protected]:
|
Structure Definition The FileHelpers already uses internally the FieldBase to store the field OR def.AddField ("f1").SetQuoted (""", engine.SetRecodInfo (def); Values Assigment On Wed, Sep 11, 2013 at 5:41 PM, Khalid Salomão [email protected]:
|
I'm back and the issue tracking too :) http://filehelpers.myjetbrains.com/youtrack/rest/agile/FileHelpers-0/sprint/3.0 I used a wrapper of the records to be used with bulk insert, that must have the structure of a DataReader, and make BulkCopy operations really fast in our ETL Tasks About Properties vs Fields, I want to support both of them, but define a policy that by default the library uses Fields, but you can specify just something like [DelimitedRecordUsingProperties(Chars.Tab)] or with a parameter or other name, but allowing the attributes to be applied to fields and properties, and only changing the get and set proxy methods all will work smothlly I like the idea of release a 3.0 with little changes in core, but I'm trying to create a ton of examples of library usage, I get ton of mails about how to use the library in so common escenarios that we need to create a demo showcase. For that I created a examples framework that checks the examples in compilation mode, but show them later in a win forms app, maybe we can create a Silverlight app or something like to avoid exe running Cheers |
Hi Marcos, I see what you mean! But sometimes an ETL process is from one text file to About Properties vs Fields, I policy seems a fine idea. But I would suggest I haven't looked into it yet, but I would suggest to keep it simpler than About the examples, great! I don't know if the silverlight app would add any value to the examples... About the emails, are you thinking in something like linqpad or scriptcs to On Fri, Sep 13, 2013 at 9:08 AM, Marcos Meli [email protected]:
|
Add to ErrorManager an recording limit to avoid possible out-of-memory errors in case of layout errors when dealing with large files. After the threshold is reached, successive errors are ignored.
About the examples, I have already created a reciclable framework for open source apps. Open the FileHelpers solution and go to Examples - Demo folder in the solution and run FileHelpers.Examples The projects uses a T4 template that go through files and convert compilable examples to html parts to be shown, also allow to run the examples and captures de console output Let me know what u think :) |
I will take a look in the weekend! On Fri, Sep 13, 2013 at 2:32 PM, Marcos Meli [email protected]:
|
Very nice! Somethings I thougth about when reading:
Regards! On Fri, Sep 13, 2013 at 5:24 PM, Khalid Salomão [email protected]:
|
Great !! 1, In fact html documentation was a secundary idea, the first idea was to create compilable examples that can be show in categories, and that the examples can be runnable, currently it uses WebBrowser, I use geckoFx at work but deployment is prohibitive too big, what do you think ?
The last and final idea is to provide a library called ExamplesFX (Examples Framework) that will allow any open source developer to write the examples in folder basics and only running a T4 template, it will get an executable with the examples and the html documentation to upload it, more yet, we can give a playground too :) but the problem with playground will be sandbox it. We need to create some google docs to discuss all these items, because is hard in comments here in github lol :) Here are the ExampleFx tickets: http://filehelpers.myjetbrains.com/youtrack/rest/agile/Examples%20Generator-1/sprint/1.0?q=%23Unresolved FileHelpers 3.0 Tickets: http://filehelpers.myjetbrains.com/youtrack/rest/agile/FH-0/sprint/3.0?q=%23Unresolved |
If there are very common scenarios, would it make sense to have a wizard-like approach that stepped through the first few questions and pointed them in the right direction? Easier said than done, certainly, but I know some of the 'optional' things (like attributes) might be less obvious/discoverable for some. Bonus points for an interface that lets them just upload a sample file (csv, tsv, xls, xlsx, whatever) and generates a working console app to parse it into a collection of POCO objects. :) On Sep 14, 2013, at 10:03 PM, khalidsalomao [email protected] wrote:
|
The current wizards is excelent as a tutorial! Maybe integrate it with the I will try to continue this discussion on the GoogleDocs, maybe split into About the ExamplesFX , nice! I also though about cs-script for playground, Regards! On Tue, Sep 17, 2013 at 3:26 PM, James Manning [email protected]:
|
Great !! Lets create and share the docs these days :) About the wizar, I like it, but we need to go to a FileHelpers IDE or something like, that allow to change properties in advace level, maybe like james point, the new project wizard will be the current one or something like, but later u can use advance things or press F5 to test the class :) Open varios projects, is a crazy idea, but will be awesome There is already a sandboxes wow !! I really like that (and run it in the IDE) Cheers and Thanks !! |
Great! FileHelpers IDE!!! On Wed, Sep 18, 2013 at 12:40 PM, Marcos Meli [email protected]:
|
Is just and idea, hard to implement, but we can do a simple app with a toolbar, that opens some fhw files, or something like, and usefull for running some text processing tasks too |
I just merged the changes with the current master and push the resolved conflicts to the branch https://github.com/MarcosMeli/FileHelpers/tree/khalidsalomao-performance Lets review and merge them back to the master 👍 |
I tried roll out some of the reasonable changes that I have already made to the FileHelpers build I use for my projects.
Performance related:
and "RecordValuesToString" methods to lessen a little the burden on the
GC (I actually didn't measure anything...)
"CompareOptions.Ordinal" whenever possible (prety good performance gain
whenever we have some heavy string working to do...)
performance gain (10%)
Access Visibility
"FieldBase", "FixedLengthField" and "DelimitedField" (similar to the
previous change on atributes!)
Other changes
ReadNextRecord () or ReadNexts () (usage in a while loop for example)
a precaution...