[Koha] GDPR - Statistics and anonymization
Jonathan Druart
jonathan.druart at bugs.koha-community.org
Wed Nov 27 09:21:48 NZDT 2019
Hello Michal,
My next steps are the three I listed in my first post.
I have already started to implement it and will open the bug reports
in the next couple of days.
Thanks for your interest :)
Cheers,
Jonathan
Please note that Koha is the only ILS known to be powerful ;-)
On Fri, 22 Nov 2019 at 21:06, Mike D. <black23 at gmail.com> wrote:
>
> Hello Jonathan,
> I understand. You're right. So, can we document the plan for the new GDPR improvements on the wiki? What are the next steps?
>
> Michal
>
> On Fri, 22 Nov 2019 at 16:53, Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
>>
>> The anonymized_borrowers table will not contain all data from
>> borrowers. Only a small part.
>> Also, we want to record the transactions when they happen; that will
>> be much easier to implement and less error-prone. How would you
>> retrieve the transactions for a given patron once they are
>> anonymized?
>>
>> Also, you want to collect statistics from this table only: it would be
>> impractical to compute statistics from this table for "anonymized"
>> patrons and from another table for the ones that are not anonymized yet.
>>
>> Does it make sense?
>>
>> On Fri, 22 Nov 2019 at 16:33, Mike D. <black23 at gmail.com> wrote:
>> >
>> > Hello Jonathan,
>> > sorry, "we want to see them as results from borrower searches" should be "we want to hide them (anonymized borrower records) from all search results"
>> >
>> > Do we really need to copy every patron into anonymized_borrowers when their record is created? I think that should be done only when we want to anonymize the borrower, not before that day. If we copy all transactions into a new table, we will make the database bigger with the same data. Do I understand it correctly?
>> >
>> > Michal
>> >
>> > On Fri, 22 Nov 2019 at 16:08, Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
>> >>
>> >> Michal,
>> >>
>> >> Redirecting the discussion back to the list.
>> >>
>> >> What do you mean by "we want to see them as results from borrower searches"?
>> >>
>> >> What I understood is that libraries want to collect statistics on
>> >> transactions even when the patron's records have been deleted.
>> >> So:
>> >> 1/ Create a patron (it's copied to borrowers and anonymized_borrowers)
>> >> 2/ Do some transactions (it's in statistics and anonymized_transactions)
>> >> 3/ Update a patron (it's updated in borrowers and
>> >> anonymized_borrowers, if option 1 is picked)
>> >> 4/ Delete the patron (moved from borrowers to deletedborrowers)
>> >> 6/ Clean/purge the deletedborrowers
>> >> => anonymized_* tables are still there, statistics still possible
>> >>
>> >> 7/ When needed, given parameters, anonymized_* entries will be deleted.
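The lifecycle listed above can be sketched as follows. This is only an illustration under stated assumptions, written in Python with an in-memory SQLite database rather than Koha's Perl and MySQL: the column lists are cut down to a few fields, the hash_id helper uses HMAC-SHA256 with a made-up key, and none of it is taken from an actual Koha patch.

```python
import hashlib
import hmac
import sqlite3

KEY = b"example-secret-key"  # hypothetical key, for illustration only


def hash_id(borrowernumber):
    # Assumed keyed hash; the thread does not name an algorithm.
    return hmac.new(KEY, str(borrowernumber).encode(), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE borrowers (borrowernumber INTEGER PRIMARY KEY, surname TEXT, categorycode TEXT);
    CREATE TABLE deletedborrowers (borrowernumber INTEGER PRIMARY KEY, surname TEXT, categorycode TEXT);
    CREATE TABLE statistics (borrowernumber INTEGER, type TEXT);
    CREATE TABLE anonymized_borrowers (hash_id TEXT PRIMARY KEY, categorycode TEXT);
    CREATE TABLE anonymized_transactions (hash_id TEXT, transaction_type TEXT);
    """
)

# 1/ Create a patron: written to borrowers and anonymized_borrowers.
conn.execute("INSERT INTO borrowers VALUES (42, 'Doe', 'ADULT')")
conn.execute("INSERT INTO anonymized_borrowers VALUES (?, 'ADULT')", (hash_id(42),))

# 2/ Do some transactions: written to statistics and anonymized_transactions.
for transaction_type in ("issue", "return"):
    conn.execute("INSERT INTO statistics VALUES (42, ?)", (transaction_type,))
    conn.execute(
        "INSERT INTO anonymized_transactions VALUES (?, ?)",
        (hash_id(42), transaction_type),
    )

# 3/ Update the patron: both copies are updated (two-table option).
conn.execute("UPDATE borrowers SET categorycode = 'STAFF' WHERE borrowernumber = 42")
conn.execute(
    "UPDATE anonymized_borrowers SET categorycode = 'STAFF' WHERE hash_id = ?",
    (hash_id(42),),
)

# 4/ Delete the patron: moved from borrowers to deletedborrowers.
conn.execute("INSERT INTO deletedborrowers SELECT * FROM borrowers WHERE borrowernumber = 42")
conn.execute("DELETE FROM borrowers WHERE borrowernumber = 42")

# 6/ Purge deletedborrowers. Cleaning the statistics table is a separate
# item in the plan and is not covered by this sketch.
conn.execute("DELETE FROM deletedborrowers")

# The anonymized_* tables are still there, so statistics remain possible.
rows = conn.execute(
    """
    SELECT b.categorycode, t.transaction_type, COUNT(*)
    FROM anonymized_transactions t
    JOIN anonymized_borrowers b USING (hash_id)
    GROUP BY b.categorycode, t.transaction_type
    ORDER BY t.transaction_type
    """
).fetchall()
print(rows)
```

After the purge, neither borrowers nor deletedborrowers holds a row for the patron, yet the final query can still count transactions per patron category.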
>> >>
>> >> On Fri, 22 Nov 2019 at 15:51, Mike D. <black23 at gmail.com> wrote:
>> >> >
>> >> > Hello,
>> >> > I think that things are maybe less complicated. I really like your idea with the hash. But if we anonymize the name, address, telephone, e-mail and notices (replace them with a generated hash or something similar), we don't need related information about borrowers, because we cut the link between the data and the person. We can flag anonymized borrower records as "anonymized" and store them separately from "live" borrowers, because we want to see them as results from borrower searches. What do you think about this?
>> >> >
>> >> > Michal
>> >> >
>> >> > On Fri, 22 Nov 2019 at 15:41, Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
>> >> >>
>> >> >> 2 tables vs 1 table :)
>> >> >>
>> >> >> On Fri, 22 Nov 2019 at 15:28, Mike D. <black23 at gmail.com> wrote:
>> >> >> >
>> >> >> > Hello Jonathan,
>> >> >> > what options? I'm missing the point :-)
>> >> >> >
>> >> >> > Michal
>> >> >> >
>> >> >> > On Fri, 22 Nov 2019 at 15:17, Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
>> >> >> >>
>> >> >> >> Thanks for the help Michal,
>> >> >> >> What about the two options I have?
>> >> >> >>
>> >> >> >> On Thu, 21 Nov 2019 at 17:58, Mike D. <black23 at gmail.com> wrote:
>> >> >> >> >
>> >> >> >> > Hi Jonathan,
>> >> >> >> > I volunteer to take part in the discussion about processes and anonymization tools and methods. I'm ready to test the bugs. Koha is GDPR-ready, but some points could be improved for easier everyday use in libraries. If something is clear and easy, everybody does it without fear and stress.
>> >> >> >> >
>> >> >> >> > Thank You
>> >> >> >> >
>> >> >> >> > Michal
>> >> >> >> >
>> >> >> >> > On Thu, 21 Nov 2019 at 17:14, Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
>> >> >> >> >>
>> >> >> >> >> Hello everybody,
>> >> >> >> >>
>> >> >> >> >> I have been contracted by KohaLa to work on some GDPR requirements.
>> >> >> >> >> The main idea is to "anonymize" patrons' data while still letting the
>> >> >> >> >> library access the transaction statistics.
>> >> >> >> >>
>> >> >> >> >> I am going to present what I am planning to implement, in order to
>> >> >> >> >> collect ideas and answers.
>> >> >> >> >>
>> >> >> >> >> These are the steps I have in mind:
>> >> >> >> >> 1. Pseudonymization [1] of patron's data
>> >> >> >> >> 2. Improve deletion of patron-related data (tables statistics,
>> >> >> >> >> old_reserves, deletedborrowers)
>> >> >> >> >> 3. Add the ability to remove data that have been pseudonymized
>> >> >> >> >>
>> >> >> >> >> I see 2 ways to achieve point 1:
>> >> >> >> >> * We create 2 tables, 1 for the patrons, 1 for the transactions.
>> >> >> >> >> - borrowers_anonymized will contain: hash_id, has_cardnumber,
>> >> >> >> >> branchcode, creation_date, categorycode, bsort1, bsort2,
>> >> >> >> >> [borrower_attributes]
>> >> >> >> >> - transaction_anonymized will contain: hash_id, transaction_type,
>> >> >> >> >> branchcode, itemnumber, holdingbranch, location, itemcallnumber,
>> >> >> >> >> itemtype, timestamp
>> >> >> >> >>
>> >> >> >> >> hash_id will be generated using the borrowernumber and a key (that
>> >> >> >> >> will be stored on the server, path in koha-conf)
>> >> >> >> >>
>> >> >> >> >> Pros: Easier to understand and manipulate as it follows existing structure.
>> >> >> >> >> We track patron's modifications (this is the most important part)
>> >> >> >> >> Cons: tech part: new config, a new path has to be created (minor)
>> >> >> >> >>
>> >> >> >> >> * We create only 1 table, (nosql-like). It will contain the same data
>> >> >> >> >> as previously, without the hash_id
>> >> >> >> >>
>> >> >> >> >> Pros: No new config. Data are never updated, so we keep the values
>> >> >> >> >> as they were when the transactions were processed.
>> >> >> >> >> Cons: Data are not updated :)
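A minimal sketch of how a hash_id like the one described for the two-table option could be derived from the borrowernumber and a server-side key. It assumes HMAC-SHA256 as the keyed hash, which the post does not specify, and it is written in Python for brevity rather than Koha's Perl. The make_hash_id function and the key value are hypothetical.

```python
import hashlib
import hmac


def make_hash_id(borrowernumber: int, key: bytes) -> str:
    """Derive a stable pseudonym from a borrowernumber and a secret key."""
    message = str(borrowernumber).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical key. In the proposal the key would be stored on the server,
# at a path configured in koha-conf.
key = b"example-secret-key"

first = make_hash_id(42, key)
second = make_hash_id(42, key)
other = make_hash_id(43, key)

print(first == second)  # same patron and key give the same hash_id
print(first == other)   # a different patron gives a different hash_id
```

Because the hash is keyed and deterministic, the same patron always maps to the same hash_id, which is what lets rows in the two tables be linked, while someone who only knows a borrowernumber cannot recompute the hash_id without the key.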
>> >> >> >> >>
>> >> >> >> >> About borrower_attributes, the initial specification asks for 2
>> >> >> >> >> attributes defined in a syspref. I think it should be configurable,
>> >> >> >> >> with a join table (Pro: more flexible, Con: SQL queries more complex)
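To illustrate the trade-off mentioned for borrower_attributes, here is a small sketch of what a join table might look like and the extra JOIN it adds to a statistics query. The table borrower_attributes_anonymized, its columns, and the sample rows are all hypothetical, and SQLite stands in for Koha's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE borrowers_anonymized (hash_id TEXT PRIMARY KEY, categorycode TEXT);
    -- Hypothetical join table: one row per patron and attribute.
    CREATE TABLE borrower_attributes_anonymized (hash_id TEXT, code TEXT, attribute TEXT);
    CREATE TABLE transaction_anonymized (hash_id TEXT, transaction_type TEXT);

    INSERT INTO borrowers_anonymized VALUES ('a1', 'ADULT'), ('b2', 'ADULT');
    INSERT INTO borrower_attributes_anonymized VALUES
        ('a1', 'FACULTY', 'science'), ('b2', 'FACULTY', 'arts');
    INSERT INTO transaction_anonymized VALUES
        ('a1', 'issue'), ('a1', 'issue'), ('b2', 'issue');
    """
)

# Counting transactions per attribute value needs a JOIN on hash_id,
# which is the "more complex SQL" cost of the flexible layout.
rows = conn.execute(
    """
    SELECT a.attribute, COUNT(*)
    FROM transaction_anonymized t
    JOIN borrower_attributes_anonymized a ON a.hash_id = t.hash_id
    WHERE a.code = 'FACULTY'
    GROUP BY a.attribute
    ORDER BY a.attribute
    """
).fetchall()
print(rows)
```

With fixed attribute columns the same count would be a single-table GROUP BY, but adding another attribute would then mean changing the table definition.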
>> >> >> >> >>
>> >> >> >> >> I think we should have the 2 tables and keep a link between the
>> >> >> >> >> anonymized_patrons and anonymized_transactions tables.
>> >> >> >> >>
>> >> >> >> >> What do you think?
>> >> >> >> >> I am going to start the implementation very soon in order to plan an
>> >> >> >> >> integration early in the 20.05 dev cycle.
>> >> >> >> >>
>> >> >> >> >> Regards,
>> >> >> >> >> Jonathan
>> >> >> >> >>
>> >> >> >> >> [1] https://en.wikipedia.org/wiki/Pseudonymization
>> >> >> >> >> _______________________________________________
>> >> >> >> >> Koha mailing list http://koha-community.org
>> >> >> >> >> Koha at lists.katipo.co.nz
>> >> >> >> >> https://lists.katipo.co.nz/mailman/listinfo/koha