
I've used a number of PDF libraries in Python and C# over the years, and none have worked as reliably as needed (that's just PDF, I guess). I haven't used pdfplumber, though, so I'll be sure to give it a go. Thanks for the suggestion.

Yes, additional metadata. I totally understand that it adds a lot of complexity, but it could help for fine-tuning an LLM.

With regard to dates: I'm not a lawyer, but for Federal legislation I would go with "Start Date". It's always the day following the End Date of the previous compilation. The Date of Assent (well, the year at least) is in the title, but also the first start date. The registration date can be either before or after the start date, depending. [1][2]
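To make the Start Date / End Date relationship concrete, here's a rough Python sketch. It assumes each compilation is recorded with a start date only, and derives each end date as the day before the next compilation starts. The `Compilation` class, its field names, and the dates in the usage example are all made up for illustration; they are not the register's actual schema or real data.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Compilation:
    # Hypothetical record; field names are illustrative only.
    number: int
    start_date: date
    end_date: Optional[date] = None  # None means "still current"


def fill_end_dates(compilations: List[Compilation]) -> List[Compilation]:
    """Set each end date to the day before the next compilation starts."""
    ordered = sorted(compilations, key=lambda c: c.start_date)
    for current, following in zip(ordered, ordered[1:]):
        current.end_date = following.start_date - timedelta(days=1)
    return ordered


# Made-up dates, purely to show the shape of the result.
comps = fill_end_dates([
    Compilation(number=2, start_date=date(2021, 7, 1)),
    Compilation(number=1, start_date=date(2020, 1, 1)),
])
for c in comps:
    print(c.number, c.start_date, c.end_date)
```

The latest compilation keeps an end date of `None`, which seems like a reasonable way to mark it as the one currently in force.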

The tricky part is when sections have different commencement dates that are detailed in the text. I don't know of anywhere that makes those easily accessible. And, if you think about it, that's usually the most important information for, say, businesses being regulated.

I wouldn't worry about timezones per se; it's relative to each particular state.[3] That's why polling in a federal election closes at 6pm in each state rather than being coordinated with the ACT.

[1] Section 12, Legislation Act 2003 https://www.legislation.gov.au/Details/C2023C00213

[2] Section 4, Acts Interpretation Act 1901 https://www.legislation.gov.au/Details/C2023C00213

[3] Section 37, Acts Interpretation Act 1901 https://www.legislation.gov.au/Details/C2023C00213


