Sometimes you need to build a name shortener that trims a long name into a shorter one. For starters, one will usually put the name into a list (an array or its equivalent) and then chop it off at the desired length.
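As a minimal sketch of this naive approach, assuming a hypothetical helper named naiveTrim (it is not part of any library), the chopping can be done with a plain byte slice:

```go
package main

import "fmt"

// naiveTrim is a hypothetical helper: it cuts s after n bytes,
// which equals n characters only when every character is ASCII.
func naiveTrim(s string, n int) string {
	if n < 0 {
		n = 0
	}
	if n >= len(s) {
		return s // already short enough
	}
	return s[:n]
}

func main() {
	fmt.Println(naiveTrim("Alexander", 4)) // Alex
	fmt.Println(naiveTrim("Bo", 4))        // Bo
}
```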
This approach works only for ASCII characters, such as A-Z and a-z. However, names across the world come in many varieties. Take a look at our king in Ace Combat 7:
Mihaly Dumitru Margareta Corneliu Leopold Blanca Karol Aeon Ignatius Raphael Maria Niketas A. Shilage
Or a shrimp sandwich in Swedish:
Räksmörgåsmacka
Such text cannot be represented in the ASCII table; it needs Unicode, usually encoded as UTF-8. Here's an example in Go code:
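
    package main

    import (
        "fmt"
    )

    // trimLength returns the first n runes (Unicode code points) of s.
    func trimLength(s string, n int) string {
        rs := []rune(s)
        if n >= len(rs) {
            return s // shorter than n runes: nothing to trim
        }
        return string(rs[:n])
    }

    func main() {
        s := "Räksmörgåsmacka" // Swedish for shrimp sandwich
        fmt.Println(s[:11])            // byte slice: illegal character at the end
        fmt.Println(trimLength(s, 11)) // rune slice: whole characters only
    }

    // Output:
    // Räksmörg�
    // Räksmörgåsm

Hence, we need to tread carefully and not underestimate the problem!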
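To see why the plain slice breaks, it may help to compare the byte length of the string with its rune count. The sketch below assumes a hypothetical helper, runeSizes, that reports how many bytes each character occupies in UTF-8; it is only an illustration, not part of the solution.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeSizes is a hypothetical helper that returns the number of
// bytes each rune of s occupies in its UTF-8 encoding.
func runeSizes(s string) []int {
	var sizes []int
	for i := 0; i < len(s); {
		_, size := utf8.DecodeRuneInString(s[i:])
		sizes = append(sizes, size)
		i += size
	}
	return sizes
}

func main() {
	s := "Räksmörgåsmacka"
	fmt.Println(len(s))                    // 18 (bytes)
	fmt.Println(utf8.RuneCountInString(s)) // 15 (runes)
	fmt.Println(runeSizes(s))              // [1 2 1 1 1 2 1 1 2 1 1 1 1 1 1]
}
```

The eleventh byte is the first half of "å", which is why s[:11] ends with an illegal character.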
One recommended approach is to go back to Unicode and build the list from there. In some languages, such as C/C++, you are required to import a Unicode locale library. The approach is:
1. Switch to the Unicode locale representation table.
2. Chop the name into a list (e.g. an array, or a slice in Go).
3. Use the locale table to slice the list to the required length.
3.1. Length is measured in "runes", since some characters take multiple bytes.
4. Encode the list back into the name's data type.
Here is an example implementation of the above algorithm:

    // Contributed by: Chew, Kean Ho (Holloway), Johan Dahl
    //
    // main program is about trimming names to 20 characters
    package main

    import (
        "fmt"
        "unicode/utf8"
    )

    const (
        maxLength = 20
    )

    type Name struct {
        first string
        last  string
    }

    // Set stores both names, trimming each to at most maxLength runes.
    func (n *Name) Set(firstName string, lastName string) {
        var rs []rune

        n.first = firstName
        if utf8.RuneCountInString(firstName) > maxLength {
            rs = []rune(firstName)
            n.first = string(rs[:maxLength])
        }

        n.last = lastName
        if utf8.RuneCountInString(lastName) > maxLength {
            rs = []rune(lastName)
            n.last = string(rs[:maxLength])
        }
    }

    // GetWithFirst returns "last, first", or only the last name.
    func (n *Name) GetWithFirst(withFirstName bool) string {
        if withFirstName {
            return n.last + ", " + n.first
        }
        return n.last
    }

    func main() {
        l := &Name{}
        l.Set("Dumitru Margareta Ἄγγελος Leopold Blanca Karol Aeon Ignatius Raphael Maria Niketas A. Shilage", "Mihaly")
        s := l.GetWithFirst(true)
        fmt.Printf("My name is: %s\n", s)
    }

    // Output:
    // My name is: Mihaly, Dumitru Margareta Ἄγ

Thinkeridea Qi Yin later approached the team with his findings on the performance of the current example. The benchmark shows that it takes quite a hit in comparison:

        return s[:n]
    }

    // SubStrC walks the string rune by rune, then copies the result.
    func SubStrC(s string, length int) string {
        var size, n int
        for i := 0; i < length && n < len(s); i++ {
            _, size = utf8.DecodeRuneInString(s[n:])
            n += size
        }

        b := make([]byte, n)
        copy(b, s[:n])
        return *(*string)(unsafe.Pointer(&b))
    }

    var s = "Go语言是Google开发的一种静态强类型、编译型、并发型,并具有垃圾回收功能的编程语言。为了方便搜索和识别,有时会将其称为Golang。"

    func BenchmarkSubStrA(b *testing.B) {
        for i := 0; i < b.N; i++ {
            SubStrA(s, 20)
        }
    }

    func BenchmarkSubStrB(b *testing.B) {
        for i := 0; i < b.N; i++ {
            SubStrB(s, 20)
        }
    }

    func BenchmarkSubStrC(b *testing.B) {
        for i := 0; i < b.N; i++ {
            SubStrC(s, 20)
        }
    }

    // Benchmark:
    // goos: darwin
    // goarch: amd64
    // BenchmarkSubStrA-8     745708    1624 ns/op    336 B/op    2 allocs/op
    // BenchmarkSubStrB-8    9568920     122 ns/op      0 B/op    0 allocs/op
    // BenchmarkSubStrC-8    7274718     157 ns/op     48 B/op    1 allocs/op
    // PASS
    // ok  command-line-arguments 4.782s

It appears that truncation is best done by walking the string with a for loop and slicing it, rather than converting the whole name first. Hence, by merging Qi Yin's findings into the existing solution, we get the final version:

    // Contributed by: Chew, Kean Ho (Holloway), Johan Dahl, Qi Yin
    //
    // main program is about trimming names to 20 characters
    package main

    import (
        "fmt"
        "unicode/utf8"
    )

    const (
        maxLength = 20
    )

    type Name struct {
        first string
        last  string
    }

    // trim returns at most length runes of s by decoding one rune at a
    // time and slicing the original string, so nothing is allocated.
    func (n *Name) trim(s string, length int) string {
        var size, x int

        for i := 0; i < length && x < len(s); i++ {
            _, size = utf8.DecodeRuneInString(s[x:])
            x += size
        }

        return s[:x]
    }

    // Set stores both names, trimming each to at most maxLength runes.
    func (n *Name) Set(firstName string, lastName string) {
        n.first = n.trim(firstName, maxLength)
        n.last = n.trim(lastName, maxLength)
    }

    // GetWithFirst returns "last, first", or only the last name.
    func (n *Name) GetWithFirst(withFirstName bool) string {
        if withFirstName {
            return n.last + ", " + n.first
        }
        return n.last
    }

    func main() {
        l := &Name{}
        l.Set("Dumitru Margareta Ἄγγελος Leopold Blanca Karol Aeon Ignatius Raphael Maria Niketas A. Shilage", "Mihaly")
        s := l.GetWithFirst(true)
        fmt.Printf("My name is: %s\n", s)
    }

    // Output:
    // My name is: Mihaly, Dumitru Margareta Ἄγ

Thus, we're now performing up to speed.
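A few edge cases are worth checking against the loop-based trimming. The sketch below copies the body of the trim method into a free function, trimRunes (a name assumed here purely for illustration), and runs it over an empty string, a name shorter than the limit, multi-byte input, and a zero length.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// trimRunes mirrors the trim method as a free function; the name is
// an assumption made for this sketch.
func trimRunes(s string, length int) string {
	var size, x int
	for i := 0; i < length && x < len(s); i++ {
		_, size = utf8.DecodeRuneInString(s[x:])
		x += size
	}
	return s[:x]
}

func main() {
	cases := []struct {
		in   string
		n    int
		want string
	}{
		{"", 5, ""},                            // empty input
		{"Mihaly", 20, "Mihaly"},               // shorter than the limit
		{"Räksmörgåsmacka", 11, "Räksmörgåsm"}, // two-byte characters
		{"语言", 1, "语"},                         // three-byte characters
		{"Shilage", 0, ""},                     // zero length
	}
	for _, c := range cases {
		got := trimRunes(c.in, c.n)
		fmt.Printf("%q -> %q (ok: %t)\n", c.in, got, got == c.want)
	}
}
```

Every line should report ok: true. Because the loop stops as soon as it reaches the end of the string, short and empty names pass through untouched without any extra length check.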
That's all about trimming names to a shorter length. Remember, the world is big, and one should always be open to everyone!