From the first PoC with El Pitazo, it looks like we can extract:
title, content, date, author, categories and tags.
It would be good to check whether our next sources expose the same fields.
In the meantime, for a VP we are counting on just the content of the post. Any change to this design will be announced here.
Problem
We currently haven't defined the flattened dataset's schema that will be consumed by the `huggingface` transformer.

Proposed Solution
Define the training dataset schema that will be used to train the `huggingface` transformer.
- Columns (`text`, `news_title`, `location`, `issue`, `source_type`, `author`, etc.)
- Column types (`varchar`, `int`, `float`, etc.)

Deliverable
`readme.md` with dataset's schema.
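
As a starting point for discussion, the schema could be sketched as a plain column-to-type mapping with a small row validator. This is only an illustration: the issue has not yet settled the column types, so treating every column as a string, marking only `text` as required, and the names `SCHEMA`, `REQUIRED` and `validate_row` are all assumptions made for this sketch.

```python
# Hypothetical schema sketch: column names come from the issue,
# but every type below is an assumption (the issue leaves types undecided).
SCHEMA = {
    "text": str,         # post content, the only field the current design relies on
    "news_title": str,
    "location": str,
    "issue": str,
    "source_type": str,
    "author": str,
}

# Assumption: only `text` is mandatory; other columns may be missing or None.
REQUIRED = {"text"}


def validate_row(row):
    """Return a list of human-readable schema violations for one row (dict)."""
    errors = []
    for column, expected in SCHEMA.items():
        value = row.get(column)
        if value is None:
            if column in REQUIRED:
                errors.append(f"missing required column: {column}")
            continue
        if not isinstance(value, expected):
            errors.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
    # Flag columns that are not part of the agreed schema.
    for column in row:
        if column not in SCHEMA:
            errors.append(f"unexpected column: {column}")
    return errors


if __name__ == "__main__":
    print(validate_row({"text": "Example post body", "author": "Jane Doe"}))
    print(validate_row({"news_title": "Title without body"}))
```

Once the types are agreed, the same mapping could be mirrored in `readme.md` as a table, and columns such as `date`, `categories` or `tags` could be added if later sources turn out to provide them.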