forked from chaoss/grimoirelab-sortinghat
# Releases
## sortinghat 0.20.0 - (2024-02-19)
**New features:**
* Organization aliases (#857)\
Organizations can be known by different names. To avoid duplicates,
organizations can have aliases. Searching for an organization using
one of its aliases returns the organization. When an organization is
merged into another, its name becomes an alias of the target
organization. If a name exists as an alias, no organization can be
created with that name and vice versa. An organization's aliases can be
added and deleted both on the organizations table and the single
organization view.
## sortinghat 0.19.2 - (2024-02-08)
* Update Poetry's package dependencies
## sortinghat 0.19.1 - (2024-02-01)
**Bug fixes:**
* Fix "Table 'django_session' doesn't exist" error\
Fixes the "Table 'django_session' doesn't exist" error for new
installs. For existing databases, run the following commands to create
the table:
```
django-admin migrate --fake sessions zero
django-admin migrate
```
## sortinghat 0.19.0 - (2024-01-30)
**New features:**
* Unify identities with same source\
Include a new option to only recommend or unify identities from
trusted sources like GitHub or GitLab that have the same username and
backend.
**Bug fixes:**
* Use correct base URL for login and change password API calls (#851)\
The URLs called to login and change password now use the public path
found in vue.config.js if no API URL is specified.
* Authentication required fixed\
When the AUTHENTICATION_REQUIRED setting is set to False, any query to
the API is allowed.
* Display individual's most recent organization\
The individual's current affiliation is now the most recent one
instead of the oldest.
* CSRF token is only required on web requests\
The GraphQL API required the 'X-CSRFToken' header, but the token could
only be retrieved by making a GET request. Now, requests authenticated
using JWT don't need to provide the CSRF token and only the user
interface, which is vulnerable to CSRF attacks and uses a different
authentication, requires it.
**Performance improvements:**
* Performance of organizations query\
Improve organization query for the table by avoiding individual
queries.
## sortinghat 0.18.0 - (2023-12-19)
**New features:**
* Link to profile in individual cards (#837)\
The name on the individuals cards now links to the member's profile.
* Open calendar to the side of the date input (#838)\
The date picker calendar that is used to edit affiliation dates now
opens to the right side of the text field to avoid covering it.
* Improved readability of job settings\
The options for the "unify" and "recommend matches" jobs are now
displayed in a clearer way.
* Improved loading time when looking for organizations\
The autocomplete field that is used to affiliate individuals to
organizations now makes fewer and lighter requests to find them,
resulting in faster loading times.
**Performance improvements:**
* Performance on affiliation recommendations improved\
We have improved the affiliation performance by one order of magnitude
by removing unnecessary queries to the database.
## sortinghat 0.17.0 - (2023-11-28)
**New features:**
* Gitdm identities importer\
New SortingHat identities importer for Gitdm format. This backend is
configured with three parameters: a URL pointing to the file that
matches emails with organizations, an optional URL for an aliases file
that associates emails, and a flag for email validation to verify the
validity of the provided email addresses.
## sortinghat 0.16.0 - (2023-11-17)
**Bug fixes:**
* Fix individual page not loading\
The individual's view was not loading when the workspace had not been
used before or the cache was cleared.
**Performance improvements:**
* Recommendations performance improved\
Improve the recommendations performance by reducing the number of
queries to the database and only generating recommendations between
individuals that are directly related.
## sortinghat 0.15.0 - (2023-11-03)
**New features:**
* Recommendations for individuals modified after a given date (#813)\
Users can generate merge and affiliation recommendations for
individuals that have been created or modified after a date specified
with the `last_modified` parameter.
* Add individual to workspace from their profile page (#816)\
A new button on the individual's profile page allows users to save the
identity in the workspace to continue working with it later on the
dashboard.
* Cache individuals table data (#821)\
Using cached queries prevents the table from refetching all the data
from the server every time any information is edited. This is
particularly helpful if there is a huge number of identities, where
reloading the table is very slow. However, there are some cases when
the queries need to be refetched, e.g. when identities are merged or
split.
## sortinghat 0.14.0 - (2023-10-20)
**New features:**
* Strict criteria for merge recommendations (#812)\
The merge recommendations filter out invalid email addresses and names
that don't have at least a first and last name when looking for
matches. To disable this behavior, set the `strict` parameter on
`recommendMatches` or `unify` to `false`.
* Text field to update enrollment dates (#819)\
Users have the option to enter the dates on a text field when editing
affiliations.
* Improved organization selector (#820)\
The organization selector that is used to affiliate individuals now
has the option to create an organization if the desired one is not
found. Its size is also increased to improve the readability of longer
names.
* API method to create a scheduled task\
The `add_scheduled_task` API method adds a new scheduled task to the
registry.
* Manage app settings from the user interface\
Users can configure automatic affiliations, profile unification and
identity data synchronization from the new `Settings` section on the
user interface.
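
The strict criteria described above for merge recommendations can be sketched roughly as follows. This is a minimal illustration, not SortingHat's implementation: the regular expression, the helper names and the dictionary layout are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately simple pattern: something@domain.tld
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email):
    """True when the value looks like a well-formed email address."""
    return bool(email) and EMAIL_RE.match(email) is not None


def has_first_and_last_name(name):
    """True when the name has at least two whitespace-separated parts."""
    return bool(name) and len(name.split()) >= 2


def usable_for_matching(identity, strict=True):
    """Return the email and name that may be used to look for matches.

    With strict=True, invalid emails and incomplete names are dropped
    (set to None); with strict=False the values are kept untouched.
    """
    email = identity.get("email")
    name = identity.get("name")
    if strict:
        if not is_valid_email(email):
            email = None
        if not has_first_and_last_name(name):
            name = None
    return {"email": email, "name": name}
```

With `strict=False` the same function returns the values unchanged, which mirrors the effect of disabling the `strict` parameter.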
**Bug fixes:**
* Remove tasks that fail to be scheduled\
When the Redis connection failed while a task was being created, the
task was added to the database but no scheduled job was linked to it.
Tasks are now removed from the database and an error is raised in this
case.
**Dependencies updates:**
* Add Python 3.9 and drop 3.7 support\
Python 3.7 reached the end of life phase on June 27 2023 and is no
longer supported.
## sortinghat 0.13.0 - (2023-08-06)
**Bug fixes:**
* Sub-domain affiliation error (#805)\
The `affiliate` and `recommend affiliations` jobs no longer recommend
matches based on a domain's sub-domains if it is not marked as
`top_domain`.
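
The rule behind this fix can be sketched as below. It is only an illustration under assumptions: the `Domain` class and `find_organization` function are hypothetical names, not part of SortingHat's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Domain:
    """Hypothetical stand-in for a domain registered to an organization."""
    name: str
    organization: str
    is_top_domain: bool = False


def find_organization(email_domain: str, domains: List[Domain]) -> Optional[str]:
    """Return the organization matching an email domain, or None."""
    email_domain = email_domain.lower()
    # An exact match always counts.
    for dom in domains:
        if dom.name == email_domain:
            return dom.organization
    # A sub-domain only counts when its parent is flagged as top domain.
    for dom in domains:
        if dom.is_top_domain and email_domain.endswith("." + dom.name):
            return dom.organization
    return None


registry = [
    Domain("example.com", "Example", is_top_domain=True),
    Domain("example.org", "Example Org"),
]
```

Here `mail.example.com` resolves to `Example` because `example.com` is a top domain, while `mail.example.org` has no match.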
**New deprecations:**
* Use the task scheduler to import identities\
Manage periodic tasks to import identities with the `scheduleTask`,
`updateScheduledTask` and `deleteScheduledTask` GraphQL mutations. The
tasks that were already scheduled using the `addImportIdentitiesTask`
mutation are kept when the migrations are applied.
## sortinghat 0.12.0 - (2023-07-23)
**New features:**
* Job scheduler\
This new feature allows users to schedule jobs, such as `affiliate` or
`unify`, to run periodically. The tasks can be configured, updated and
deleted using the GraphQL API.
## sortinghat 0.11.1 - (2023-07-11)
**Bug fixes:**
* Show an organization's members\
Repeatedly clicking on the button to see the members of an
organization or team on the table sometimes showed the full
individuals list.
## sortinghat 0.11.0 - (2023-06-28)
**New features:**
* Merge organizations (#571)\
Merging organizations automatically moves all the domains, teams and
enrollments to the target organization. This is helpful in case an
organization has duplicates or if an organization absorbs another one.
Organizations can be merged using drag and drop on the user interface.
* Recommendations by individual (#779)\
Users can generate matching recommendations for a specific individual
by clicking on the drop down menu on each individual or on the
individual's profile.
**Bug fixes:**
* Show hidden buttons when the mouse is over the table row (#787)\
The buttons to lock an individual or mark it as a bot were only
visible when the mouse was over the individual's name, which made it
hard to find them. Now they appear when the mouse is over the table
row.
* Email affiliation error (#793)\
Fix an error when the email domain ends with a dot, causing the
affiliation process to stop.
* ADD button doesn't affiliate individuals to organizations\
Affiliating an individual to an organization using the "+ ADD" button
on the table expanded view failed.
* Enrollment filter on organizations view\
Filtering individuals by their affiliation to an organization also
returned results of organizations that contained that name. The filter
now only returns organizations that match the exact name.
## sortinghat 0.10.0 - (2023-05-17)
**New features:**
* Show when tables are loading (#772)\
The individuals and organizations tables now show a progress bar to
indicate that the items are loading.
* Organization profiles\
Each organization's full profile is available by clicking its name on
the table or at `/organization/<organization name>`. This view shows
the organization's teams, members and domains.
**Bug fixes:**
* Sort jobs from newest to oldest (#769)\
The jobs page now sorts the list from newest to oldest to show running
jobs first.
* Unreadable large numbers in pagination (#770)\
Large page numbers were not fully visible in the tables pagination.
* Edit a profile name with the pencil button (#773)\
Clicking on an individual's name no longer activates the edit mode.
The name can now be edited with the pencil button.
* Fix enrollment in individual's profile\
In the individual's profile, the button to add an organization was not
working.
* Job timeouts\
Jobs failed because they exceeded the default timeout while running
tasks involving numerous identities. To ensure successful completion,
we adjusted the timeout setting to an infinite duration, allowing jobs
to finish without interruptions.
**Breaking changes:**
* Multi-tenancy using headers\
Tenants are now selected using the `sortinghat-tenant` header instead
of the host. Proxies and clients using multi-tenancy should include
that header.
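
A client-side sketch of the change, using only the Python standard library. The URL, tenant name and payload are placeholders chosen for the example; only the `sortinghat-tenant` header name comes from the note above. The request is built but not sent.

```python
import urllib.request


def build_tenant_request(url, tenant, payload):
    """Build a POST request that selects the tenant through a header."""
    request = urllib.request.Request(url, data=payload, method="POST")
    request.add_header("Content-Type", "application/json")
    # The tenant is no longer derived from the host name.
    request.add_header("sortinghat-tenant", tenant)
    return request


request = build_tenant_request(
    "https://sortinghat.example.com/api/",  # placeholder URL
    "tenant_a",                             # placeholder tenant name
    b'{"query": "{ __typename }"}',
)
```

`urllib` stores header names capitalized (`Sortinghat-tenant`); HTTP header names are case-insensitive, so the server side sees the same header.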
**Performance improvements:**
* Performance improved for recommendations and merging jobs\
The performance of the matching and merging algorithms that are used
on these jobs has been considerably improved. These jobs will be 4
times faster than on the previous version.
* uWSGI threads and workers\
Include two new environment variables to define the number of threads
and workers for uWSGI. These new variables are
`SORTINGHAT_UWSGI_WORKERS` and `SORTINGHAT_UWSGI_THREADS`.
* SortingHat database performance\
Improve SortingHat performance when there are a lot of individuals in
the database.
**Dependencies updates:**
* Update dependencies\
Include google-auth as a dependency to fix release issues.
## sortinghat 0.9.3 - (2023-04-28)
**Bug fixes:**
* Tenant selection in job fixed\
Tenant selection raised an error when the job context was defined as
keyword argument.
## sortinghat 0.9.2 - (2023-04-27)
**Bug fixes:**
* Static files not included in wheel package\
SortingHat static files were not included in the Python package. The
problem was in the GitHub action.
## sortinghat 0.9.1 - (2023-04-26)
**Bug fixes:**
* Static files not included in wheel package\
SortingHat static files were not included in the Python package.
## sortinghat 0.9.0 - (2023-04-21)
**New features:**
* Set top domain from UI (#729)\
Add the option to set an organization's domain as top domain from the
UI.
* Order individuals by identities (#732)\
Adds the option to order the individuals by the number of identities
they have.
* Import identities automatically (#746)\
Create a schema to import identities to SortingHat automatically using
custom backends. The jobs will be executed periodically, at the given
interval. The tasks can be configured using the GraphQL API. To
create a custom importer you need to extend `IdentitiesImporter`,
define a `NAME` for your importer (that will be used in the UI), and
implement `get_identities` method that returns a list of individuals
with the related identities that will be imported into SortingHat. If
your importer requires extra parameters, you must extend the
`__init__` method with the required parameters. Those parameters can
be defined using the API.
* Create account command\
Include a new command to create users in SortingHat. The command can
be executed as `sortinghat-admin create-user`.
* Drag and drop to enroll in teams\
Expanding an organization on the table now shows the full list of
teams. Individuals can be dragged and dropped into a team and
vice versa to affiliate them. The buttons to add, edit and delete
organization and team information are reorganized into a dropdown menu
to simplify the interface.
* Multi-tenancy mode\
SortingHat allows hosting multiple instances with a single service,
keeping each instance's data isolated in a different database. To enable
this feature, follow these guidelines:
  - Set the `MULTI_TENANT` setting to `True`.
  - Define the tenants in `sortinghat/config/tenants.json`.
  - Assign users to tenants with the `sortinghat-admin set-user-tenant`
    command.
* Verify SSL option for client\
Include an option for the client to verify if the certificate is
valid. By default it is verified.
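
The steps listed under "Import identities automatically" (extend `IdentitiesImporter`, define `NAME`, implement `get_identities`, extend `__init__` for extra parameters) can be sketched as below. To keep the example self-contained, `IdentitiesImporter`, `Individual` and `Identity` are simplified stand-ins written for this sketch; the constructor signature and data classes of the real SortingHat classes are assumptions and may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identity:
    """Stand-in for an identity to import (assumed fields)."""
    source: str
    email: Optional[str] = None
    username: Optional[str] = None


@dataclass
class Individual:
    """Stand-in for an individual and the identities related to it."""
    name: str
    identities: List[Identity] = field(default_factory=list)


class IdentitiesImporter:
    """Stand-in for SortingHat's base class (assumed interface)."""
    NAME = None

    def __init__(self, ctx, url):
        self.ctx = ctx
        self.url = url

    def get_identities(self):
        raise NotImplementedError


class StaticListImporter(IdentitiesImporter):
    """Hypothetical importer that returns a fixed list of individuals."""
    NAME = "static-list"  # name shown in the UI

    def __init__(self, ctx, url, source="example"):
        # Extra parameter: the source label given to imported identities.
        super().__init__(ctx, url)
        self.source = source

    def get_identities(self):
        identity = Identity(source=self.source,
                            email="jane@example.com",
                            username="jdoe")
        return [Individual(name="Jane Doe", identities=[identity])]


importer = StaticListImporter(ctx=None, url="https://example.com/identities.json")
people = importer.get_identities()
```

A real importer would fetch and parse the data behind `url` inside `get_identities` instead of returning fixed values.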
**Bug fixes:**
* Fix outdated recommendation count (#733)\
The number of remaining recommendations on the UI was wrong each time
a recommendation was applied or dismissed.
* Fix search syntax link (#735)\
Fixes the link to the search syntax page on the search bar.
**Feature removals:**
* Groups table removed from the UI\
Groups and organizations are very similar, and having both tables in
the dashboard can be confusing to users. To simplify the view, the
table is removed from the user interface, but groups remain available
through the API.
## sortinghat 0.8.1 - (2023-02-03)
* Update Poetry's package dependencies
## sortinghat 0.8.0 - (2023-02-01)
**New features:**
* Migration command for SortingHat 0.7 (#726)\
Include a new command to migrate SortingHat 0.7 database schema to
0.8. The command can be executed as
`sortinghat-admin migrate-old-database`. It will migrate all the data
from the old version.
**Bug fixes:**
* GraphQL client headers updated\
SortingHat client headers are updated adding `Referer` and `Host` to
fix the CSRF token issue.
**Breaking changes:**
* SortingHat as a service\
SortingHat started as a command line tool but, after some years, we
saw its potential and we decided to create a new version of it. Now,
it works as an individual service. This new version provides a new
GraphQL API to operate with the server and a UI web-based app, that
replaces Hatstall, the old UI for SortingHat. Moreover, the new
version adds some features requested a long time ago, such as
groups/teams management, recommendations of affiliations and
individuals, or a totally renewed user interface.
## sortinghat 0.7.23 - (2022-11-07)
* Update Poetry's package dependencies
## sortinghat 0.7.22 - (2022-10-31)
* Update Poetry's package dependencies
## sortinghat 0.7.21 - (2022-09-26)
**Others:**
* Update package dependencies\
Update jinja2 package and dev-dependencies.
## sortinghat 0.7.21-rc.6 - (2022-09-26)
**Others:**
* Update package dependencies\
Update jinja2 package and dev-dependencies.
## sortinghat 0.7.21-rc.5 - (2022-09-26)
**Others:**
* Update package dependencies\
Update jinja2 package and dev-dependencies.
## sortinghat 0.7.21-rc.4 - (2022-09-26)
**Others:**
* Update package dependencies\
Update jinja2 package and dev-dependencies.
## Sorting Hat 0.7.20 - (2022-06-02)
**Bug fixes:**
* [gitdm] Skip invalid format lines\
Gitdm parser won't fail reading files with an invalid format. Instead,
it will ignore invalid content.
## Sorting Hat 0.7 - (2018-10-02)
**NOTICE: Database schema generated by SortingHat < 0.7.0 is still
compatible but older versions can have problems inserting UTF-8
characters of 4 bytes.
Python 2.7 is no longer supported.
Please check "Compatibility between versions" section from README.md file.
**
** New features and improvements: **
* Python 2.7 no longer supported
As Python 2.x will not be maintained after 2020, SortingHat is only
compatible with Python >= 3.4.
* Low level API
This API is able to execute basic operations over the database, such
as adding or removing identities or finding entities. All these operations
work within a session. Nothing is stored in the database until the
session is closed. Thus, these functions can be considered as "bricks",
that combined can create high-level functions.
* Storage of UTF-8 4-bytes characters
The default charset of UTF-8 (utf8) in MySQL/MariaDB does not support
4-byte characters, even though they are part of the standard.
This means characters like emojis or certain Chinese characters cannot
be inserted. Identity names or usernames often contain these types of
characters.
The charset that fully supports UTF-8 is `utf8mb4` using the collation
`utf8mb4_unicode_520_ci`. This collation implements the suggested Unicode
Collation Algorithm (v5.2).
Using `utf8mb4` also implies that the maximum size of char (VARCHAR and
so on) columns is 191. Indexes cannot be larger than that when using
InnoDB engine.
Starting on 0.7 series, SortingHat is using this charset.
* Handle disconnection using pessimistic mode
SQLAlchemy offers a pessimistic mode to handle database disconnection.
Setting `pool_pre_ping` parameter on the database engine will check if
the database connection is still active when a session of the connection
pool is reused. This causes a small hit in the performance but it's worth
it.
* Use an optimistic approach when inserting data
With this optimistic approach, no more queries to check whether an entity
exists on the database are run prior to its insertion.
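
The optimistic approach can be illustrated with the standard `sqlite3` module. This is a sketch of the general technique only: SortingHat 0.7 used SQLAlchemy against MySQL/MariaDB, and the table and function names here are hypothetical.

```python
import sqlite3


def add_organization(conn, name):
    """Insert a row without checking first; return False if it existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO organizations (name) VALUES (?)", (name,)
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate; no SELECT was needed.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE organizations (name TEXT PRIMARY KEY)")

first = add_organization(conn, "Example")
second = add_organization(conn, "Example")
count = conn.execute("SELECT COUNT(*) FROM organizations").fetchone()[0]
```

The uniqueness constraint does the work that a prior existence query would otherwise do, saving one round trip per insertion in the common case where the entity is new.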
## Sorting Hat 0.6 - (2018-03-05)
**NOTICE: Database schemas generated by SortingHat < 0.6.0 are no longer
compatible. Please check "Compatibility between versions" section from
README.md file**
** New features and improvements: **
* Gender.
The gender of a unique identity can be set in the profile using the command
`profile` and data will be stored in the table of the same name. This table
adds two new fields: `gender`, a free text field to set the gender
value, and `gender_acc`, to set the accuracy of the gender - in a range
of 1 to 100 - when it is set using automatic options.
The new command `autogender` has also been added. It assigns a gender
to each unique identity using the name of the profile and the information
provided by `http://genderize.io`. Possible values are *male* or *female*.
* Option for reusing a database.
An existing database can be reused when `init` command is called. So far,
when the database was already created, this command raised an exception.
* Version option.
Calling `sortinghat` with the option `-v | --version` prints the version
of `sortinghat` running on the system.
* Tests improvements.
Some minor changes were done in the testing area. The main ones were to
support MariaDB engine and to use a remote testing database.
## Sorting Hat 0.5 - (2017-12-21)
**NOTICE: Database schemas generated by SortingHat < 0.5.0 are no longer
compatible. Please check "Compatibility between versions" section from
README.md file**
** New features and improvements: **
* Last modification.
Unique identities and identities log the last time they were modified
by adding, deleting, moving, merging, updating the profile, adding
or removing enrollments operations.
The new `search_last_modified_identities` API function allows searching
for the UUIDs of those identities modified on or after a given date.
* No strict matching option.
This option skips the rigorous validation of values while
matching identities, for instance, requiring well-formed email addresses
or names with first name and last name. This option is available on
`load` and `unify` commands.
* Reset option while loading.
Before loading any data, if `reset` option is set, all the relationships
between identities and their enrollments will be removed from the
database.
* GrimoireLab support.
GrimoireLab identities and organizations YAML files can be converted
to Sorting Hat JSON format using the script `grimoirelab2sh`.
** Bugs fixed: **
* Fix tables created with invalid collation. In some random situations
Sorting Hat tables appear with an invalid collation. This is related
to a wrong generation of the DDL table statement by SQLAlchemy, which
may randomly prepend the collation information (`MYSQL_COLLATE`) to
the charset one (`MYSQL_CHARSET`), causing the former to be ignored.
Changing `MYSQL_CHARSET` to `MYSQL_DEFAULT_CHARSET` fixed the problem.
* Remove trailing whitespaces in exported JSON files. This error is only
found in Python 2.7 due to a bug in the standard library with
`json.dump()` and `indent` parameter. (#103)
* Update profile information when loading identities. So far, profile
information was set only the first time a unique identity was loaded.
With this change, it will be updated always, except when the given
profile is empty.
## Sorting Hat 0.4 - (2017-07-17)
** New features and improvements: **
* Mailmap and StackAlytics support.
Mailmap and StackAlytics files can be converted to Sorting Hat JSON
format using the new scripts `mailmap2sh` and `stackalytics2sh`.
* Unify by sources.
Given a list of sources, this option allows the `unify` command to
merge only those unique identities which belong to any of the given
sources.
** Bugs fixed: **
* Encoding error generating UUIDs in Python 3. Some special characters
cannot be encoded in Python3. This caused function `uuid()` to fail
when converting those characters. 'surrogateescape' handler was
added to fix that problem.
* Force `utf8_unicode_ci` collation on MySQL tables to fix integrity errors.
MySQL considers chars like `β` and `b` or `ı` and `i` the same, when
some collation values are set (e.g. `utf8_general_ci`). This can raise
integrity errors when Sorting Hat tries to add similar identities with
these pairs of characters.
For instance, if the identity:
('scm', 'βart', '[email protected]', 'bart')
is stored in the database, the insertion of:
('scm', 'bart', '[email protected]', 'bart')
will raise an error, even when these identities have different UUIDs.
Forcing MySQL to use `utf8_unicode_ci` fixes this error, allowing
to insert both identities.
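
The encoding fix mentioned in the first item of this list relies on Python's `surrogateescape` error handler. The sketch below shows the behaviour of that handler in isolation; the sample bytes are made up for the example and this is not the `uuid()` function itself.

```python
# Bytes that are not valid UTF-8 (here, a Latin-1 encoded "ö").
raw = b"J\xf6hn"

# Decoding with surrogateescape keeps the undecodable byte as a lone
# surrogate code point instead of failing.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding that string back with the default strict handler fails...
try:
    name.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# ...while the surrogateescape handler restores the original bytes.
restored = name.encode("utf-8", errors="surrogateescape")
```

Because the handler round-trips the original bytes, a value containing such characters can still be encoded and hashed.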
## Sorting Hat 0.3 - (2017-03-21)
**NOTICE: UUIDs generated by SortingHat < 0.3.0 are no longer compatible.
Please check "Compatibility between versions" section from README.md file**
** New features and improvements: **
* New algorithm to generate UUIDs.
UUIDs were generated using case and accent sensitive values with the seed
`(source:email:name:username)`. This means that identities with the same
values in lower or upper case (e.g. `[email protected]` and `[email protected]`)
or with and without accents (e.g. `John Smith` and `Jöhn Smith`) had
different UUIDs for each of these combinations.
The new algorithm changes upper to lower case characters and converts accented
characters to their canonical form before the UUID is generated.
This change is caused by the behaviour of MySQL with case configurations
and accented and unaccented characters. MySQL considers those characters the same,
raising `IntegrityError` exceptions when similar tuple values are inserted
into the database. Generating the same UUID for these cases will prevent the
error.
Take into account that previous UUIDs are no longer compatible with this
version of SortingHat. You should regenerate the UUIDs following the steps
described in section *Compatibility between versions* from `README.md` file.
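
The normalization idea can be sketched as follows. Only the seed layout `(source:email:name:username)`, the lower-casing and the accent folding come from the description above; the use of SHA-1, the handling of missing values and the function names are assumptions of this sketch, and the email addresses are illustrative.

```python
import hashlib
import unicodedata


def normalize(value):
    """Lower-case a value and strip accents (empty string for None)."""
    if value is None:
        return ""  # assumption: how missing fields are represented
    decomposed = unicodedata.normalize("NFKD", value)
    # Drop the combining marks left behind by the decomposition.
    unaccented = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return unaccented.lower()


def identity_uuid(source, email=None, name=None, username=None):
    """Hash the normalized seed source:email:name:username."""
    seed = ":".join(normalize(v) for v in (source, email, name, username))
    data = seed.encode("utf-8", errors="surrogateescape")
    return hashlib.sha1(data).hexdigest()  # assumption: SHA-1 digest


upper = identity_uuid("scm", "JSmith@example.com", "Jöhn Smith", "jsmith")
lower = identity_uuid("scm", "jsmith@example.com", "John Smith", "jsmith")
```

Both calls produce the same identifier, so inserting either variant cannot collide with the other under a case and accent insensitive collation.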
** Bugs fixed: **
* Any non-empty value in email field was used during the affiliation. This
caused errors for invalid email addresses such as 'email@',
which raised an `IndexError` exception. This bug has been fixed by using only
valid email addresses during the affiliation.
* Invalid database names were allowed in `init` command.
## Sorting Hat 0.2 - (2017-02-01)
** New features and improvements: **
* Auto complete profile information with `autoprofile` command.
This command autocompletes the profile information related to a set of unique
identities. To update the profile, the command uses a list of sources ordered
by priority. Only those unique identities which have one or more identities
from any of these sources will be updated. The name of the profile will be
filled using the best name possible, normally the longest one.
* GitHub identities matching method.
This new method tries to find equal identities using those identities from
GitHub sources. The identities must come from a source starting with a `github`
label and the usernames must be equal.
** Bugs fixed: **
* The parser for Gitdm files only accepted email addresses as valid aliases.
This has been modified to accept any type of aliases. Thus, the input file
passed to `gitdm2sh` script will be a list of valid aliases instead of email
aliases.