std::string and contiguous memory

There was a reply to a post that said:

...A vector is guaranteed to store the elements in contiguous memory. This is one of the requirements of vector<T>. Internally a std::string is *not* guaranteed to store elements contiguously, so vector<char> and std::string may not be the same in terms of how memory is laid out. ...
Is this true (the bold part)? I would think that it would be more practical to require that std::string be contiguous and std::vector does not have to be contiguous.

Thanks,
John Flegert
[549 byte] By [jflegert] at [2007-11-17 22:37:47]
# 1 Re: std::string and contiguous memory
std::vector is contiguous.

From Josuttis's book: "The C++ standard library does not
state clearly whether the elements of a vector are required
to be contiguous memory. However, it is the intention that
this is guaranteed and it will be fixed due to a defect report"

From Meyers's book, concerning using vector and string
in legacy code: "The approach to getting a pointer to the
beginning of the container data that works for vector isn't
reliable for strings, because (1) the data for strings are
not guaranteed to be stored in contiguous memory, and
(2) the internal representation of a string is not guaranteed
to end with a null character."
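
A minimal sketch of the distinction in that quote, assuming a made-up legacy function (legacy_sum is hypothetical, not from either book): a vector can hand out &v[0] directly, while a string goes through c_str(), which is specified to return a null-terminated array however the string stores its data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a legacy C function that reads an int array.
extern "C" long legacy_sum(const int* data, std::size_t count) {
    assert(data != 0 || count == 0);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// vector: elements are contiguous, so &v[0] points at the whole array.
// The empty check matters because v[0] on an empty vector is undefined.
long sum_with_legacy_api(const std::vector<int>& v) {
    return v.empty() ? 0 : legacy_sum(&v[0], v.size());
}

// string: use c_str() instead of &s[0], so the code does not depend on
// how the string lays out its characters internally.
std::size_t length_with_legacy_api(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Calling sum_with_legacy_api on a vector holding 1, 2, 3 should give 6, and length_with_legacy_api("hello") should give 5.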
Philip Nicoletti at 2007-11-8 1:11:12 >
# 2 Re: std::string and contiguous memory
Regarding std::string, this just doesn't seem practical. Think of what the implementation of std::string::c_str() would have to be like if the characters weren't contiguous (or didn't have '\0' on the end).
jflegert at 2007-11-8 1:12:16 >
# 3 Re: std::string and contiguous memory
It is a matter of what is guaranteed. Also from
Meyers's book concerning c_str() ... : "... there is no
guarantee that c_str yields a pointer to the internal
representation of the string data. It could return a
pointer to an unmodifiable copy of the string's data,
one that is correctly formatted for a C API. (If this makes
the efficiency hairs on your neck rise up in alarm, rest
assured that the alarm is probably false. I don't know
of any contemporary library implementation that takes
advantage of this latitude.)"
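
Since c_str() yields a pointer to const data that may or may not be the internal buffer, one cautious way to feed a string to a C function that writes into its argument is to copy it into a vector<char> first. This is only a sketch: legacy_uppercase is a hypothetical function, and the code assumes the string has no embedded '\0' characters.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical legacy C function that modifies a null-terminated buffer in place.
extern "C" void legacy_uppercase(char* text) {
    assert(text != 0);
    for (; *text != '\0'; ++text)
        *text = static_cast<char>(std::toupper(static_cast<unsigned char>(*text)));
}

std::string uppercase_via_legacy_api(const std::string& s) {
    // Copy the characters plus the terminator into contiguous, writable storage.
    const char* src = s.c_str();
    std::vector<char> buffer(src, src + s.size() + 1);

    // buffer is never empty here: it always holds at least the '\0'.
    legacy_uppercase(&buffer[0]);

    // Rebuild a string from the modified buffer (stops at the first '\0').
    return std::string(&buffer[0]);
}
```

The extra copy costs something, but it never writes through the pointer returned by c_str().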
Philip Nicoletti at 2007-11-8 1:13:22 >
# 4 Re: std::string and contiguous memory
quote:
------------------------
...A vector is guaranteed to store the elements in contiguous memory. This is one of the requirements of vector<T>. Internally a std::string is *not* guaranteed to store elements contiguously, so vector<char> and std::string may not be the same in terms of how memory is laid out. ...
------------------------

The C++ standard doesn't say anything about how string or vector must be implemented.
The definition of vector given in the C++ standard (section 23.2.4) is:
[
A vector is a kind of sequence that supports random access iterators.
In addition, it supports (amortized) constant time insert and erase
operations at the end; insert and erase in the middle take linear
time. Storage management is handled automatically, though hints can
be given to improve efficiency.
]
As far as C++ is concerned, if you satisfy all the requirements of vector, it doesn't matter what kind of memory management you use.

As for your statement, it may or may not be true.
If you want to find out about a particular implementation, such as MSVC or any other compiler, you have to debug it yourself.
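
One way to do that kind of checking, as a rough sketch: looks_contiguous is a made-up helper that compares the address of each element with &c[0] + i. It only probes one object on one implementation and says nothing about what the standard guarantees. It will not compile for vector<bool>, which does not store real bool objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if every element of c sits at address &c[0] + i.
// Works for containers with operator[] that returns a real reference,
// such as std::string and std::vector<T> for T other than bool.
template <typename Container>
bool looks_contiguous(const Container& c) {
    if (c.empty()) return true;  // nothing to compare; avoid c[0] on empty
    const typename Container::value_type* first = &c[0];
    for (typename Container::size_type i = 0; i < c.size(); ++i) {
        if (&c[i] != first + i) return false;
    }
    return true;
}
```

For example, looks_contiguous(std::string("hello")) and looks_contiguous(std::vector<int>(100, 7)) can be run under whichever compiler and library you care about.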

Vinod
vinodp at 2007-11-8 1:14:21 >
# 5 Re: std::string and contiguous memory
As Philip pointed out (and so has Nicolai Josuttis), this is a defect that has been reported and has been corrected by ANSI since the standard document was released. If you are using a vector that doesn't use contiguous storage, it is a non-conforming version of vector.

Scott Meyers has dedicated an entire chapter in his "Essential STL" book to using vector with legacy C APIs. He wouldn't have written it if he knew that vector was not guaranteed to be contiguous.

Regards,

Paul McKenzie
Paul McKenzie at 2007-11-8 1:15:20 >
# 6 Re: std::string and contiguous memory
Originally posted by Paul McKenzie
Scott Meyers has dedicated an entire chapter in his "Essential STL" book to using vector with legacy C APIs. He wouldn't have written it if he knew that vector was not guaranteed to be contiguous.

A small remark...the book that you are referring to is named "Effective STL" - just in case someone wants to take a look at it...
Andreas Masur at 2007-11-8 1:16:22 >
# 7 Re: std::string and contiguous memory
It is true that the C++ standard does NOT guarantee that the internal data used in std::string occupies contiguous memory.

But it is also true in life that things that are NOT guaranteed by regulation or otherwise on paper are usually guaranteed by practicality.

It is perfectly legal to overpay your tax. But in practice few people are stupid enough to intentionally overpay their taxes. It is also perfectly standard-compliant to implement certain things in some mind-twisting way, but no one would do it.

It is so natural and straightforward to implement std::string using a contiguous memory block that an implementation that does not use contiguous memory is hard to conceive of.

In fact, no one uses non-contiguous memory in implementing std::string.
AnthonyMai at 2007-11-8 1:17:22 >
# 8 Re: std::string and contiguous memory
Anthony...

Have you ever looked into the number of people who get a TAX REFUND each year? For 2000 (I could not locate the figures for 2001) it was over 130 MILLION. By your logic these people are all stupid...

...It is perfectly legal to overpay your tax. But in practice few people are stupid enough to intentionally overpay their taxes...

In terms of libraries that do NOT store strings as contiguous character strings, please take a look at the Linguistic Processor Libraries. Internally their "strings" are stored as "...multiple chains of information...". Each "...unit sequence of characters [typically a word]..." is internally represented by a number of collections. The actual ASCII text (I have heard they now have a Unicode version, but my copy is about 5 years old) for a given "word" is only stored ONCE, regardless of the number of times it appears in the document. A sentence is made up of a vector of token pointers (for word order) and a collection of "grammar terms" (e.g. Noun(s), Verb, Adjective(s), etc.). In addition, each token that represents a word has a series of reference pointers back to the grammar elements that contain it.

These classes are presented to the user as strings, totally hiding the complexity (and power) of the library. Nearly every place that you can use a std::string, you can use one of these objects, and all DEFINED operations on a std::string may be performed on these strings as well.

Just wanted to point out that while there may not be any "compiler" libraries that implement std::string as anything other than a contiguous (probably NULL-terminated) set of characters, there DEFINITELY ARE third-party libraries that use other internal representations.
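
To give a rough idea of that kind of layout, here is a toy sketch. TokenizedText is a made-up class for illustration only, not the API of the library described above: each distinct word is stored once, word order is kept as a vector of pointers, and a contiguous string only exists when one is assembled on request.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical class, for illustration only.
class TokenizedText {
public:
    explicit TokenizedText(const std::string& text) {
        std::istringstream in(text);
        std::string word;
        while (in >> word) {
            // set::insert returns an iterator to the already-stored element
            // when the word is a repeat, so each distinct word is kept once.
            order_.push_back(&*words_.insert(word).first);
        }
    }

    std::size_t distinct_words() const { return words_.size(); }
    std::size_t word_count() const { return order_.size(); }

    // A contiguous representation has to be built on demand.
    // Runs of whitespace in the input come back as single spaces.
    std::string flatten() const {
        std::string result;
        for (std::size_t i = 0; i < order_.size(); ++i) {
            if (i != 0) result += ' ';
            result += *order_[i];
        }
        return result;
    }

private:
    // Copying would leave order_ pointing into another object's set,
    // so copying is disabled (declared private, never defined).
    TokenizedText(const TokenizedText&);
    TokenizedText& operator=(const TokenizedText&);

    std::set<std::string> words_;            // each distinct word stored once
    std::vector<const std::string*> order_;  // word order, pointing into words_
};
```

For the input "the cat saw the dog", this stores four distinct words and five order entries, and flatten() has to do real work to produce the kind of buffer a c_str()-style call would need.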
TheCPUWizard at 2007-11-8 1:18:27 >
# 9 Re: std::string and contiguous memory
Originally posted by TheCPUWizard
In terms of libraries that do NOT store strings as contiguous character strings, please take a look at the Linguistic Processor Libraries. Internally their "strings" are stored as "...multiple chains of information...".

Also check out SGI's "rope" class for a non-contiguous string (not exactly std::string) implementation.

http://www.sgi.com/tech/stl/Rope.html

Regards,

Paul McKenzie
Paul McKenzie at 2007-11-8 1:19:24 >