Problem 44541. Arrange the names in alphabetical order (2)

David Verrelli

7 solvers

3 likes

Arrange the list of names in alphabetical order, following the German standard DIN 5007, Variant 2, §6.1.1.4.2, which is intended for lists of people's names.

Special characters: ä = ae, ö = oe, ü = ue, ß = ss.

These substitutions apply only when determining the correct sequence; the special characters themselves must be retained unaltered in the final output. Other accents would typically be ignored, but none are present in the Test Suite. Hyphens and spaces do not affect the sequence.
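The substitution rule can be sketched as a comparison key that is used for ordering only, while the original name is kept for the output. This is a minimal sketch, not the reference solution: the variable names are illustrative, lowercasing the key is an assumption made here so that case does not influence the order, and the file is assumed to be saved in an encoding that preserves the umlauts.

```matlab
% Build a comparison key for one name (illustrative variable names).
name = 'Hölderlin, Friedrich';

key = lower(name);                 % assumption: compare case-insensitively
key = strrep(key, 'ä', 'ae');      % ä = ae
key = strrep(key, 'ö', 'oe');      % ö = oe
key = strrep(key, 'ü', 'ue');      % ü = ue
key = strrep(key, 'ß', 'ss');      % ß = ss
key(key == '-' | key == ' ') = []; % hyphens and spaces do not affect the sequence

disp(key)    % hoelderlin,friedrich
disp(name)   % the original spelling is what goes into the output
```

Only `key` is ever compared; `name` is never modified, which is how the special characters survive in the result.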

Prefixes: Ignore prefixes such as "von", "von der", "vor", "am", "zum". These can generally be identified by their lack of capitalisation (see the example below). Capitalisation (uppercase versus lowercase) must be preserved in your final output.
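One possible way to skip the prefixes is to split the surname into words and start the comparison at the first capitalised word. The sketch below assumes that every word beginning with a lowercase letter before that point is a prefix, which is a heuristic taken from the hint above rather than a rule quoted from the standard; the variable names are illustrative.

```matlab
% Find the part of the surname that takes part in sorting.
name = 'von Hofmannsthal, Hugo';

surname = strtrim(strtok(name, ','));   % text before the first comma
words   = strsplit(surname, ' ');       % {'von', 'Hofmannsthal'}

% A word counts as a prefix if it starts with a lowercase letter (assumption).
isPrefix  = cellfun(@(w) ~isempty(w) && isstrprop(w(1), 'lower'), words);
firstCore = find(~isPrefix, 1);
if isempty(firstCore)
    firstCore = 1;                      % no capitalised word: keep everything
end

core = strjoin(words(firstCore:end), ' ');
disp(core)   % Hofmannsthal
```

The prefix is dropped only from this working copy; `name` itself still contains "von" for the final output.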

Sorting should be based on the surname (family name). The surname, together with any prefixes, will always appear first, followed by a comma and then the given name(s) (first name(s)). In principle, if two surnames were alike, one would next have to sort by the given name(s); however, that situation does not arise in the Test Suite.

Inputs comprise cell arrays of character vectors. The cell arrays can be either row or column vectors. Return your output in the same type of vector.
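Returning the same type of vector does not necessarily need a special case. As a small sketch: if the ordering is obtained as a permutation and applied by indexing, MATLAB keeps the orientation of the indexed vector. The plain `sort` call below is only a stand-in for whatever DIN 5007 comparison is actually used.

```matlab
% Indexing a vector with a permutation keeps the vector's orientation.
rowIn = {'Holz, Arno', 'Hofmann, Michael'};   % 1-by-2 row
colIn = rowIn.';                              % 2-by-1 column

[~, idx] = sort(rowIn);    % stand-in ordering, returns the permutation idx

rowOut = rowIn(idx);       % still a row vector
colOut = colIn(idx);       % still a column vector

disp(isrow(rowOut))        % 1
disp(iscolumn(colOut))     % 1
```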

EXAMPLE:

 % Input
 in = {'Hofmann, Michael' 
       'Hölderlin, Friedrich' 
       'Holz, Arno'
       'van Hoddis, Jakob' 
       'von Hofmannsthal, Hugo'}
 % Output
 out = {'van Hoddis, Jakob' 
       'Hölderlin, Friedrich' 
       'Hofmann, Michael' 
       'von Hofmannsthal, Hugo' 
       'Holz, Arno'}
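The example can be reproduced with a short sketch that combines the rules above. It is one possible approach under stated assumptions, not the reference solution: `sortNamesSketch` and `surnameKey` are hypothetical names, leading lowercase words are assumed to be prefixes, ties between equal surnames are not handled, and local functions in a script require R2016b or later.

```matlab
% Example input from the problem statement (column cell array).
in = {'Hofmann, Michael'
      'Hölderlin, Friedrich'
      'Holz, Arno'
      'van Hoddis, Jakob'
      'von Hofmannsthal, Hugo'};

out = sortNamesSketch(in);
disp(out)

function out = sortNamesSketch(in)
% Sort by a derived key; the names themselves are returned unaltered.
    keys = cellfun(@surnameKey, in, 'UniformOutput', false);
    [~, idx] = sort(keys);
    out = in(idx);              % keeps the row/column orientation of "in"
end

function key = surnameKey(name)
% Comparison key for the surname (the text before the first comma).
    words = strsplit(strtrim(strtok(name, ',')), ' ');
    % Drop leading words that start with a lowercase letter (assumed prefixes).
    while numel(words) > 1 && isstrprop(words{1}(1), 'lower')
        words(1) = [];
    end
    key = lower(strjoin(words, ''));    % joining without a separator drops spaces
    key = strrep(key, '-', '');         % hyphens do not affect the sequence
    key = strrep(key, 'ä', 'ae');
    key = strrep(key, 'ö', 'oe');
    key = strrep(key, 'ü', 'ue');
    key = strrep(key, 'ß', 'ss');
end
```

The keys for the five names are `hofmann`, `hoelderlin`, `holz`, `hoddis` and `hofmannsthal`, so the resulting order matches the expected output shown above.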

Solution Stats

21 Solutions

7 Solvers

Last Solution submitted on Nov 27, 2022

Problem Comments

2 Comments

J-G van der Toorn on 11 Mar 2018

What I am struggling with (for a project) is code that does fuzzy matching. I sometimes have lists of names in which some names are misspelled by one or more characters. In some entries the first name is listed first, in others last. Intermediates such as 'van' or 'von der' may be used or not, and may appear before the last name, separately, or at the end. I would still like to match each name to its most likely counterpart across two lists (given a certain threshold for when they are really different).

David Verrelli on 12 Mar 2018

Your project sounds both interesting & tricky. I have created one more problem that may interest you in that respect, namely Problem 44383. I guess it is vaguely like fuzzy matching, but with sentences (rather than names). There was also Problem 93 in the Cody Challenge.
