⚛ Hz
⚛ Hz
like sqlite3
⚛ Hz
https://sqlite.org/appfileformat.html
⚛ Hz
But as a beginner, it is understandable to implement a crappy database of your own
Anonymous
Anonymous
Anonymous
i was just saying that they won't impact performance because not every write will be an actual disk write
⚛ Hz
(unless your program is killed accidentally)
⚛ Hz
* you may want to write to a temporary file and swap the two after closing it
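A minimal sketch of that temp-file-and-swap idea, assuming C++17 and `std::filesystem`. The function name `save_via_temp_file` and the ".tmp" suffix are made up for illustration. Note that `flush()` only hands the data to the OS; it does not force it onto the disk.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Writes `contents` to "<target>.tmp" (a hypothetical naming scheme), then
// renames the temporary file over `target`. Returns false on any failure.
bool save_via_temp_file(const fs::path& target, const std::string& contents) {
    assert(!target.empty());
    fs::path tmp = target;
    tmp += ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out.write(contents.data(), static_cast<std::streamsize>(contents.size()));
        out.flush();
        if (!out) return false;
    }  // the stream is closed here, before the swap
    std::error_code ec;
    fs::rename(tmp, target, ec);  // replaces an existing target file
    return !ec;
}
```

If the program dies while writing, only the temporary file is damaged and the old target is still intact.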
Anonymous
Anonymous
Anonymous
I gave an example for a word processor. See if you can do something similar: divide the file into pages, keep the current page in RAM in structs, and only write it to the ofstream when the user moves to a different page.
btw thank you for trying to help me
Anonymous
⚛ Hz
⚛ Hz
you can reserve space if you don't want to let it allocate memory when appending
⚛ Hz
https://en.cppreference.com/w/cpp/string/basic_string/reserve
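A small sketch of what reserving buys you. The helper name `append_without_reallocation` is made up. It checks that the storage does not move while appending up to the reserved size, which is how mainstream standard library implementations behave.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `count` bytes one at a time to a buffer reserved up front and
// reports whether the underlying storage stayed at the same address.
bool append_without_reallocation(std::size_t count) {
    std::string buffer;
    buffer.reserve(count);               // one allocation up front
    assert(buffer.capacity() >= count);  // reserve guarantees at least this much
    const char* before = buffer.data();
    for (std::size_t i = 0; i < count; ++i) {
        buffer.push_back(static_cast<char>(i & 0x7f));
    }
    return buffer.data() == before;      // same address: no reallocation happened
}
```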
⚛ Hz
and if you have to read the whole file anyway, you don't need to seek anymore
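Roughly like this, as a sketch: read everything into a `std::string` once, then edit by index instead of seeking. The names `read_whole_file` and `overwrite_at` are hypothetical helpers, not from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Reads the entire file into memory in one go.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Overwrites bytes in the in-memory copy; plain indexing replaces seekp().
void overwrite_at(std::string& data, std::size_t offset, const std::string& patch) {
    assert(offset <= data.size() && patch.size() <= data.size() - offset);
    data.replace(offset, patch.size(), patch);
}
```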
Anonymous
⚛ Hz
tip: a memory-mapped file cannot be easily extended
⚛ Hz
it has to be fixed size
⚛ Hz
no, that won't work for a file mapping made with mmap / MapViewOfFile
the only way to resize is to destroy the old map, resize the file, and re-map
⚛ Hz
Especially since the mapping here covers the whole file, which means you can't use mremap
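The destroy, resize, re-map sequence could look something like this on POSIX systems. This is only a sketch: `open_for_mapping` and `grow_file_mapping` are made-up names, and the Windows side (MapViewOfFile) is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Opens (and creates or truncates) a file that will back the mapping.
int open_for_mapping(const char* path) {
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

// Destroys the old map, resizes the file, and maps it again.
// Returns the new base address (it may differ from the old one),
// or nullptr on failure. Pass old_addr == nullptr for the first mapping.
void* grow_file_mapping(int fd, void* old_addr, std::size_t old_size, std::size_t new_size) {
    assert(new_size > old_size);
    if (old_addr != nullptr && munmap(old_addr, old_size) != 0) return nullptr;
    if (ftruncate(fd, static_cast<off_t>(new_size)) != 0) return nullptr;
    void* addr = mmap(nullptr, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return addr == MAP_FAILED ? nullptr : addr;
}
```

Since the base address can change, anything that kept a raw pointer into the old map has to be recomputed from an offset.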
⚛ Hz
* a portable way is to apply & and >> repeatedly to split the value into individual bytes
(casting to a byte array may cause byte-order problems)
⚛ Hz
or simply
#include <cstdint>
std::uint32_t value = 0x12345678;  // example input
unsigned char result[4];           // least significant byte first
result[0] = (value & 0x000000ff);
result[1] = (value & 0x0000ff00) >> 8;
result[2] = (value & 0x00ff0000) >> 16;
result[3] = (value & 0xff000000) >> 24;
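The same thing wrapped into a pair of helpers, as a sketch. The names `put_u32_le` and `get_u32_le` are made up. Because it only uses shifts and masks, the byte layout is the same on little-endian and big-endian hosts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stores `value` least significant byte first, regardless of host byte order.
void put_u32_le(std::uint32_t value, unsigned char* out) {
    assert(out != nullptr);
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xffu);
    }
}

// Rebuilds the value from four bytes written by put_u32_le.
std::uint32_t get_u32_le(const unsigned char* in) {
    assert(in != nullptr);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    }
    return value;
}
```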
Anonymous
⚛ Hz
wait, is that any different from using std::string?
⚛ Hz
hmmm, I don't think this issue needs to be considered... The original question did not even mention using it from a different thread.
and it is not MT-safe when you want to resize the shared memory
⚛ Hz
but you can use std::string::reserve to reserve space
Anonymous
⚛ Hz
it makes no difference anyway (according to the C++ memory model, dynamically allocated memory can be accessed by multiple threads by default)
Anonymous
⚛ Hz
unless you decide to override the global memory allocator, all memory allocated by STL containers should be globally accessible
⚛ Hz
you can always use a pre-allocated string as a general byte buffer,
use it as
std::string buffer;
std::mutex mtx;
and only lock it when you need to resize (operator[], c_str() and data() should be safe since the space has already been reserved, so resizing causes no additional allocation, which means the address won't change)
But it sounds crazy to modify the size from multiple threads: how can you ensure that other data structures have not been destroyed before this?
anyway, it should be no less safe than using raw system-provided functions like mmap, since you need to read the whole documentation to use them correctly
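A rough sketch of the string-plus-mutex buffer. The class name `SharedByteBuffer` is made up. This version is more conservative than described above: it takes the lock for reads as well, on the assumption that a reader could otherwise race with an append changing the size.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>

class SharedByteBuffer {
public:
    // Reserves the space once so later appends within it need no allocation.
    explicit SharedByteBuffer(std::size_t capacity) { buffer_.reserve(capacity); }

    // Appends bytes and returns the offset they were written at.
    std::size_t append(const std::string& bytes) {
        std::lock_guard<std::mutex> lock(mtx_);
        const std::size_t offset = buffer_.size();
        buffer_.append(bytes);
        return offset;
    }

    // Copies out up to `length` bytes starting at `offset`.
    std::string read(std::size_t offset, std::size_t length) const {
        std::lock_guard<std::mutex> lock(mtx_);
        assert(offset <= buffer_.size());
        return buffer_.substr(offset, length);
    }

private:
    mutable std::mutex mtx_;
    std::string buffer_;
};
```

Handing out offsets instead of pointers keeps callers valid even if the string ever does reallocate.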
⚛ Hz
1. Create an anonymous mapping with the required size (it may be greater than the size of the file to leave room for edits)
2. Read the file into this memory
3. Edit the memory
4. Write the changes into a new file.
This should be the right way for multiple reasons
A) It is more efficient because you interface directly with the OS and you know what it is doing, rather than using an STL data structure which does so many things behind your back.
B) It can be made thread safe much more easily than strings. I am not sure how you are going to enforce locking the mutex around every function that requires a resize, other than locking on every non-const operation on the string, which makes it even more inefficient. With memory-mapped regions, you can easily set up a guard region at the end of the mapped memory region and, only when writes flow into this region, lock a mutex and call mremap with the MREMAP_MAYMOVE flag, which ensures that only pointers into the region are invalidated while offsets remain valid even after the move. Typically with file operations you will mostly be working with offsets, which makes it easier.
C) Strings are supposed to represent characters and not bytes of memory, and hence must not be exploited for such purposes.
D) With a memory-mapped region, any attempt to write beyond the region can be tracked and made to fault (with additional system calls), but with strings it may silently lead to Undefined Behavior.
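Steps 1 to 4 above could be sketched like this on POSIX systems. The function names are made up, the guard region from B) is not shown, and error handling is minimal (for example, EINTR is not retried). If the file is larger than `capacity`, only the first `capacity` bytes are loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Steps 1 and 2: map `capacity` anonymous bytes and read the file into them.
// Returns nullptr on failure; `*length` receives the number of bytes read.
char* load_into_anonymous_map(const char* path, std::size_t capacity, std::size_t* length) {
    assert(length != nullptr);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    void* mem = mmap(nullptr, capacity, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { close(fd); return nullptr; }
    char* base = static_cast<char*>(mem);
    std::size_t total = 0;
    while (total < capacity) {
        ssize_t n = read(fd, base + total, capacity - total);
        if (n < 0) { munmap(mem, capacity); close(fd); return nullptr; }
        if (n == 0) break;  // end of file
        total += static_cast<std::size_t>(n);
    }
    close(fd);
    *length = total;
    return base;
}

// Step 4: write the (edited) bytes out to a new file.
bool write_to_new_file(const char* path, const char* data, std::size_t length) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    std::size_t total = 0;
    while (total < length) {
        ssize_t n = write(fd, data + total, length - total);
        if (n < 0) { close(fd); return false; }
        total += static_cast<std::size_t>(n);
    }
    return close(fd) == 0;
}
```

Step 3 is then ordinary pointer or offset access into the returned memory, up to `capacity` bytes.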
(if you consider UB, all system interfaces are UB in standard C++)
yes, that is a downside of string/mutex: since C++ doesn't have a Rust-like ownership system, it's all the programmer's responsibility
and yes, char is not byte; this is a historical issue.
The defects of using the system interface are more serious in my opinion: it is not portable. It is not even just about Linux / Windows; how about macOS? Do you know all the differences between macOS and Linux? (example: nginx misused SO_REUSEPORT/SO_REUSEADDR, so it did not work on macOS in some versions)
after all, I don't think such a simple task needs a complex and platform-dependent solution. It may be a good program, but it is definitely a bad design choice... do you really need to write to a single file from multiple threads? If this is a serious requirement, then there are definitely more problems to consider
⚛ Hz
⚛ Hz
cutting it down to uint16 is trivial
Anonymous
If you are going to bitch about portability when he didn't even mention that he wants his application to be portable, why is it so wrong to expect a program to be thread safe?
His question was about efficiency, and the option I gave was more efficient and safer.
Why are system calls UB in C++? Lol, what does that even mean? Even the C++ runtime relies on system calls. It is not portable, but making it portable is easy because all the major operating systems support creating a shared memory region, extending it, and writing the contents of a shared memory region into a file. It will be slightly more work to make it portable across all OSes, but the benefits are that
1. It will be much more efficient
2. It will be safer to use
3. It is the right thing to do
And using a string for this is the bad design choice, as you are using something for purposes it was not intended for. It is just a hack, even setting aside all the other disadvantages I mentioned.
Anonymous
@you_Know_it0 you learning C for hacking?
⚛ Hz
Anonymous
some people learn C for ethical hacking
Anonymous
ah okey
⚛ Hz
That's the point: no multiple threads means no need for that complexity
Anonymous
⚛ Hz
If you are really concerned about safety, you may want to handle the case where the process is killed accidentally and only half of the file has been written
I found this small project
https://github.com/m-byte918/Binary-Reader-Writer/blob/master/Buffer.cpp
it stores all the data in a vector
⚛ Hz
⚛ Hz
The std interface is the only thing you can rely on (if you need behavior that is defined by the standard)
Anonymous
Anonymous
Read about what the C++ standard defines as Undefined Behavior here
And no, system calls are not UB according to the standard as long as you call them the right way.
⚛ Hz
OK, that's my fault, I was confusing it with Rust's unsafe
⚛ Hz
Anonymous