Wim
Well, it was technically made mostly of cpp macros
I think you understand way better than vbextreme who still thinks any compiler is a macro assembler
Daniele°
Because in C it is undefined behaviour
In C, 'a' is an int per the standard; in C++ it is a char
BinaryByter
as he said
Daniele°
No
Wim
The size of an int is platform dependent in C
Wim
In cpp it's unicode based
BinaryByter
mostly
BinaryByter
the standard doesn't define any size for int
BinaryByter
so an implementation of int as 8bit is fine
Daniele°
The size of an int is platform dependent in C
On AVR, sizeof('a') returns 2, not 4
Wim
True but chars are mainly considered unicode
Wim
On AVR, sizeof('a') returns 2, not 4
Here we go again, booting in 10 bytes from a disk on avr☺️
BinaryByter
😂
Wim
huh?
On modern day systems within cpp
BinaryByter
On modern day systems within cpp
but isn't char insufficient to store even one unicode character?
Wim
Uc is 16 bits, 1 word
BinaryByter
yea, so you need two chars
Wim
No, 2 bytes
BinaryByter
the size of two chars*
Daniele°
True but chars are mainly considered unicode
Char is one byte in C, and one byte is the minimal addressable unit: generally 8 bits, but on some DSPs it can be 16 bits
Wim
Look at your compiler output
BinaryByter
Char is one byte in C, and one byte is the minimal addressable unit: generally 8 bits, but on some DSPs it can be 16 bits
one byte is the minimal addressable size? Pretty sure that there are exotic archs with other sizes
BinaryByter
Look at your compiler output
Well, it will allocate two bytes, but that's equivalent to two chars
Wim
You'll see a bunch of 0 padded chars
BinaryByter
sure, they are padded
BinaryByter
but if I have two chars or one short, they will both be padded
Wim
That doesn't ring a bell?
BinaryByter
huh?
BinaryByter
not yet 😂
BinaryByter
I don't get what you're getting at, though
Wim
Put some kanji in it
BinaryByter
but putting kanji into a padded char is UB
BinaryByter
because you don't know whether it's actually padded
BinaryByter
also, in an array it won't be padded
Wim
but putting kanji into a padded char is UB
It's allowed, it'll change the pad byte as it's not padded -> it's unicode
Wim
also, in an array it won't be padded
That depends, but I agree
BinaryByter
It's allowed, it'll change the pad byte as it's not padded -> it's unicode
it's allowed but undefined behaviour since you don't know FOR SURE that it's padded
BinaryByter
not every compiler will pad
BinaryByter
that's why I'd use a short for them instead
BinaryByter
also, that's why I'd not say that calling a char a UC is quite right
Wim
True and true, which is why many OSes screw up on it
BinaryByter
😂
BinaryByter
this is quite sad though
BinaryByter
because I think that most devs wouldn't make this error
Wim
Yes
BinaryByter
so the FEW devs who DO make this error, are on linux 😂
BinaryByter
says a lot lol
Wim
MS has ASCII and Wide variants; at the base most will pad, while other functions drop back to ASCII
Wim
It's a huge screwup in total
BinaryByter
almost every kernel that's popular is a screwup
Ariana
UC characters are quite useful in exploitation, even in web areas where there are polymorphic UC characters. Some UC characters are harmless by themselves but become harmful when taken as ASCII (which something written in, say, cpp likely would do). UC characters are a pain
BinaryByter
😂
BinaryByter
ansi without kanji. NICE :D
Ariana
No need for that, we just need to change our programs
Wim
You see early CPP moving to unicode, others that copy from it seeing 'padding' and replicating that, then both upgrading to arrays and stuff and all of a sudden moving into full ASCII again
BinaryByter
No need for that, we just need to change our programs
actually, let's move back to ASCII
Ariana
XD
Wim
It's true though, compare them; you'll laugh your ass off
BinaryByter
😂
BinaryByter
I already laugh my ass off when I look into the linux source code
Wim
But then having shit like Ascii and Wide versions of functions because they lost track
BinaryByter
and I laugh my ass off twice when some indian asks me how to start in kernel dev on linux
Wim
Or seeing others just padding because others seem to do it too, without thinking about extended unicode characters ..
Wim
I bet most never tried doing extended unicode chars when copying other compilers' behaviour
Wim
'Hey they output 0041 .. I should do that too'
BinaryByter
'Hey they output 0041 .. I should do that too'
"hey they emit pms .. I should do that too"
BinaryByter
😂
Wim
same league
Wim
😂
Wim
Even worse, as it's not at the interpretation but at the codegen side