Keitaro84
17th November 2022, 15:52
I'm new to LN and don't know all the constructs yet, so I'm not sure I'm doing this right.

I need to iterate over each character in a string, but it breaks when I try to retrieve a special character.

string my.string (100)
long str.length

my.string = "aèiòu"
str.length = len(my.string)

| I was expecting a length of 5, but it is 11
for i = 1 to (str.length)
    string characther
    | Retrieving a special character like è or ò fails here
    characther = my.string(i;i)
    | TODO CODE
endfor


Is there a way to get every single character without causing that error?

Thank you

srkndnsn
18th November 2022, 09:25
Hi,

The error description doesn't tell us much, so it's hard to say for sure, but the special characters are multibyte characters (2 bytes each, not 1).

Best regards
Serkan

Keitaro84
18th November 2022, 11:19
I think I have found a solution. I'm sharing it in case it is useful to others.

string my.string (100)

my.string = "¼àè°òù§][*+ç£ç;.<abcd"

long arr.size
string arr.char(1,1) based

| populate the character array
for i = 1 to (len(strip$(my.string)))
    long info, j, k
    | If mb.char.info returns 0, the character can be taken. While
    | debugging I noticed that each special character is preceded by
    | a sequence like this: -1, then a spurious 188 (¼), -1 again,
    | and only then the special character (e.g. è). For the
    | character ¼ itself the sequence is: -1, ¼, -1, ¼.
    info = mb.char.info(my.string(i;1))
    if(info eq 0) then
        j = asc(my.string(i;1))
        | count consecutive ¼ (188) values, so that a genuine ¼
        | can still be put in the array
        if(j eq 188) then
            k = k + 1
        endif
        | take the character if it is not ¼, or if it is the
        | second ¼ in a row
        if(j <> 188 or (j eq 188 and (k \ 2 = 0))) then
            arr.size = arr.size + 1
            alloc.mem(arr.char, 9, arr.size)
            arr.char(1,arr.size) = my.string(i;1)
            | reset the ¼ counter
            k = 0
        endif
    endif
endfor

free.mem(arr.char)

mark_h
18th November 2022, 14:08
I was wondering if it had to do with mb characters. I never had to mess with them - what I did with special characters was a little search and replace. But what I had to deal with was really simple. Thanks for posting the solution.

avpatil
18th November 2022, 15:23
You are doing
characther = my.string(i;i)
I think you should do
characther = my.string(i;1)

Keitaro84
21st November 2022, 09:31
Yes, I noticed that

characther = my.string(i;i)

was wrong and had to be changed to

characther = my.string(i;1)


but, as srkndnsn says, a special character is multibyte, so it still fails if you try to get it with my.string(i;1).

andreas.toepper
23rd November 2022, 08:47
That’s because you did not declare my.string as multibyte. I think if you use

string my.string (100) mb

it should work.
But you cannot use BASED with multibyte strings ("You cannot use multibyte strings in combination with the BASED mechanism"; quoted from the Reference Guide, section "Multibyte strings").

So you will still need to modify your script.
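
Applied to the loop from the first post, that might look roughly like the sketch below. It is untested and rests on two assumptions: that for a string declared with mb, len() counts characters rather than bytes, and that (i;1) then selects one whole character. The variable names str.length and character are only for illustration.

string my.string (100) mb
string character (1) mb
long i, str.length

my.string = "aèiòu"
| assumption: with mb, this is the number of characters
str.length = len(my.string)

for i = 1 to str.length
    | assumption: with mb, this selects the i-th character,
    | even if it occupies more than one byte
    character = my.string(i;1)
endfor

If the length still comes back larger than expected, the mb.char.info check from the earlier post is probably still needed.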

Working with a German LN setup, I sometimes need to filter German umlauts from strings. I use something like this:

function domain tcmcs.str256m stripumlaute(domain tcmcs.str256m i.sern)
{
    long i, did.it
    domain tcmcs.str256m new.sern

    new.sern = ""
    for i = 1 to len(i.sern)
        did.it = false
        if i.sern(i;1) = "ß" then
            new.sern = new.sern & "ss"
            did.it = true
        endif
        if i.sern(i;1) = "Ä" then
            new.sern = new.sern & "AE"
            did.it = true
        endif
        if i.sern(i;1) = "Ü" then
            new.sern = new.sern & "UE"
            did.it = true
        endif
        if i.sern(i;1) = "Ö" then
            new.sern = new.sern & "OE"
            did.it = true
        endif
        if i.sern(i;1) = "ä" then
            new.sern = new.sern & "ae"
            did.it = true
        endif
        if i.sern(i;1) = "ü" then
            new.sern = new.sern & "ue"
            did.it = true
        endif
        if i.sern(i;1) = "ö" then
            new.sern = new.sern & "oe"
            did.it = true
        endif
        if not did.it then
            new.sern = new.sern & i.sern(i;1)
        endif
    endfor

    return(new.sern)
}

It's a simple character comparison, but the characters themselves are multibyte characters. (I prefer using multibyte string domains.)

Keitaro84
23rd November 2022, 12:42
Thanks a lot, your suggestion was really helpful. I rewrote the function:

function void array.characters(ref string arr(,), domain tcmcs.str999m str)
{
    long i

    for i = 1 to (len(strip$(str)))
        long info
        domain tcmcs.str1m character

        character = str(i;1)
        info = mb.char.info(character)

        if(info eq 0 and character <> "") then
            array.add(arr, character)
        endif
    endfor
}


I used it in this test function, which converts every other character of a string to uppercase:

function void alternate.uppercase()
{
    long i
    string my.string (100), result (100)

    my.string = "àbcdèfghìlmnòpqrstùvz"
    result = ""

    string arr.char(1,1) based
    | populate the character array
    array.characters(arr.char, my.string)

    | alternate uppercase
    for i = 1 to array.size(arr.char)
        result = result & (i\2 <> 0 ? toupper$(trim$(arr.char(1,i))) : trim$(arr.char(1,i)))
    endfor

    free.mem(arr.char)
}

function void array.add(ref string arr(,), string value(999))
{
    long size, lng
    long nr.dims
    long dim.array(4)

    array.info(arr, nr.dims, dim.array)
    size = dim.array(2)
    lng = len.in.bytes(value)

    alloc.mem(arr, lng > dim.array(1) ? lng : dim.array(1), size + 1)
    arr(1, size + 1) = value
}

function long array.size(ref string arr(,))
{
    long nr.dims
    long dim.array(4)

    array.info(arr, nr.dims, dim.array)

    return(dim.array(2))
}