Hello,

A bit of fresh news: I have submitted a set of patches that are ready to be tested.
The main bug report is bug 24151 (Add a pseudonymization process for patrons and transactions):
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24151

I pushed a remote branch with everything applied in the correct order to my GitLab repo:
https://gitlab.com/joubu/Koha/commits/bug_24151

Cheers,
Jonathan

On Thu, 21 Nov 2019 at 17:13, Jonathan Druart <jonathan.druart@bugs.koha-community.org> wrote:
Hello everybody,
I have been contracted by KohaLa to work on some GDPR requirements. The main idea is to "anonymize" patrons' data while still letting the library access the transaction statistics.
I am going to present what I am planning to implement, in order to collect ideas and answers.
These are the steps I have in mind:
1. Pseudonymization [1] of patrons' data
2. Improve deletion of patron-related data (tables statistics, old_reserves, deletedborrowers)
3. Add the ability to remove data that have been pseudonymized
I see 2 ways to achieve point 1:
* We create 2 tables, 1 for the patrons, 1 for the transactions.
  - borrowers_anonymized will contain: hash_id, has_cardnumber, branchcode, creation_date, categorycode, bsort1, bsort2, [borrower_attributes]
  - transaction_anonymized will contain: hash_id, transaction_type, branchcode, itemnumber, holdingbranch, location, itemcallnumber, itemtype, timestamp
hash_id will be generated from the borrowernumber and a key (the key will be stored on the server, with its path set in koha-conf).
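To illustrate how a hash_id could be derived from the borrowernumber and a server-side key, here is a minimal sketch in Python. The message above does not say which hash function will be used, so HMAC-SHA-256 is an assumption made for the example, as are the function name make_hash_id and the sample key; this is not the code from the patches.

```python
import hashlib
import hmac


def make_hash_id(borrowernumber: int, key: bytes) -> str:
    """Return a keyed, deterministic pseudonym for a borrowernumber.

    Assumption: HMAC-SHA-256 stands in for whatever keyed hash is chosen.
    """
    message = str(borrowernumber).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical key; in the proposal it would be read from a file on the
# server whose path is configured in koha-conf.
key = b"example-secret-read-from-the-server"

a = make_hash_id(42, key)
b = make_hash_id(42, key)
c = make_hash_id(43, key)
```

The same borrowernumber and key always give the same hash_id, so a patron's row and their transactions can still be matched, while someone without the key cannot recompute the hash_id from a known borrowernumber.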
Pros: Easier to understand and manipulate, as it follows the existing structure. We track patrons' modifications (this is the most important part).
Cons: On the technical side, a new config entry and a new path have to be created (minor).
* We create only 1 table (NoSQL-like). It will contain the same data as above, without the hash_id.
Pros: No new config. Data are never updated, so we keep the values as they were when the transaction was processed.
Cons: Data are not updated :)
About borrower_attributes: the initial specification asks for 2 attributes defined in a syspref. I think it should be configurable, with a join table (Pro: more flexible; Con: more complex SQL queries).
I think we should have the 2 tables and keep a link between the anonymized_patrons and anonymized_transactions tables.
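As a rough illustration of the two-table option, the sketch below builds both tables in an in-memory SQLite database and joins them on hash_id to produce statistics. It only includes a subset of the columns listed earlier; the column types, the sample rows, and the use of SQLite are assumptions made for the example, not the actual Koha schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the proposed columns; types are assumptions.
conn.executescript("""
CREATE TABLE borrowers_anonymized (
    hash_id      TEXT PRIMARY KEY,
    branchcode   TEXT,
    categorycode TEXT
);
CREATE TABLE transaction_anonymized (
    hash_id          TEXT REFERENCES borrowers_anonymized(hash_id),
    transaction_type TEXT,
    branchcode       TEXT,
    itemnumber       INTEGER,
    timestamp        TEXT
);
""")

# Hypothetical sample data: one pseudonymized patron with two transactions.
conn.execute(
    "INSERT INTO borrowers_anonymized VALUES (?, ?, ?)",
    ("abc123", "CPL", "ST"),
)
conn.executemany(
    "INSERT INTO transaction_anonymized VALUES (?, ?, ?, ?, ?)",
    [
        ("abc123", "issue", "CPL", 1001, "2019-11-21 10:00:00"),
        ("abc123", "return", "CPL", 1001, "2019-11-28 09:30:00"),
    ],
)

# Statistics per patron category, with no direct patron identifier involved.
rows = conn.execute("""
    SELECT b.categorycode, t.transaction_type, COUNT(*)
    FROM transaction_anonymized t
    JOIN borrowers_anonymized b ON b.hash_id = t.hash_id
    GROUP BY b.categorycode, t.transaction_type
    ORDER BY t.transaction_type
""").fetchall()
```

Because the patron row is stored once and linked by hash_id, a later change to the patron (for example a new categorycode) is reflected in every joined statistic, which is the "we track patrons' modifications" advantage mentioned above.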
What do you think? I am going to start the implementation very soon in order to plan an integration early in the 20.05 dev cycle.
Regards,
Jonathan