std::transform doesnt always work
compiler: VC++ 6.0 (SP 6) and XP.
I'm attempting to do a case-insensitive sort, but occasionally the transform function screws up the string -- that is, it does not correctly convert characters from lower to upper case. Example: "hello" becomes "(ELLO". What's wrong?
Also: is there a simpler way to do this?
Thanks
int comp(string& s1,string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),_toupper);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),_toupper);
return i1.compare(i2);
}
int main()
{
vector<string> theList;
theList.push_back("alice");
theList.push_back("Hello");
theList.push_back("below");
theList.push_back("hello");
sort(theList.begin(),theList.end(),comp);
vector<string>::iterator it;
for(it = theList.begin(); it != theList.end(); it++)
cout << *it << endl;
return 0;
}
I changed it to this and it seems to work now, but I still don't know why the previous version above didn't work:
int comp(string& s1,string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),tolower);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),tolower);
return i1 < i2;
}
[1358 byte] By [stober] at [2007-11-19 11:11:03]

# 1 Re: std::transform doesnt always work
Well this code:
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;
int comp(const string& s1, const string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),_toupper);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),_toupper);
return i1.compare(i2);
}
int main ( void )
{
vector<string> theList;
theList.push_back("alice");
theList.push_back("Hello");
theList.push_back("below");
theList.push_back("hello");
sort(theList.begin(),theList.end(),comp);
vector<string>::iterator it;
for(it = theList.begin(); it != theList.end(); it++)
cout << *it << endl;
system("PAUSE");
return 0;
}
Got me the following output:
hello
below
Hello
alice
Pass a const reference instead of a non-const.
# 2 Re: std::transform doesnt always work
Well this code:
Got me the following output:
hello
below
Hello
alice
Pass a const reference instead of a non-const.
Yup -- that output is the same as mine, and it is wrong too.
# 3 Re: std::transform doesnt always work
I could not reproduce the character truncation problem.
I would recommend the following sort predicate:
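// Sort lexicographically, ignoring case.
// Note: stricmp is not standard C++; MSVC provides it (also spelled _stricmp).
bool comp (const string& s1, const string& s2)
{
const char* pszString1 = s1.c_str ();
const char* pszString2 = s2.c_str ();
return (stricmp (pszString1, pszString2) < 0);
}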
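For anyone who wants the stricmp approach outside MSVC, here is a minimal sketch. The helper names compare_no_case and less_no_case are made up for illustration, and it assumes the platform offers either _stricmp (MSVC) or the POSIX strcasecmp; neither is part of standard C++.

```cpp
#include <cassert>
#include <string>
#if defined(_MSC_VER)
#include <string.h>   // _stricmp (MSVC-specific)
#else
#include <strings.h>  // strcasecmp (POSIX, assumed available here)
#endif

// Hypothetical helper: three-way, case-insensitive compare of two C strings.
inline int compare_no_case(const char* a, const char* b)
{
#if defined(_MSC_VER)
    return _stricmp(a, b);
#else
    return strcasecmp(a, b);
#endif
}

// Strict "less than" predicate built on top of it, suitable for std::sort.
inline bool less_no_case(const std::string& s1, const std::string& s2)
{
    return compare_no_case(s1.c_str(), s2.c_str()) < 0;
}

// A few checks documenting the intended behaviour.
inline void check_less_no_case()
{
    assert(less_no_case("alice", "Hello"));   // 'a' sorts before 'h' when case is ignored
    assert(!less_no_case("Hello", "hello"));  // equal when case is ignored ...
    assert(!less_no_case("hello", "Hello"));  // ... so neither is less than the other
}
```

It would be used as sort(theList.begin(), theList.end(), less_no_case). Both underlying functions compare byte by byte, so this only makes sense for plain single-byte text.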
# 4 Re: std::transform doesnt always work
#include <vector>
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;
int comp(const string& s1,const string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),_toupper);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),_toupper);
return i1.compare(i2) < 0;
}
int main()
{
vector<string> theList;
theList.push_back("alice");
theList.push_back("Hello");
theList.push_back("below");
theList.push_back("hello");
sort(theList.begin(), theList.end(), comp);
vector<string>::const_iterator it;
for(it = theList.begin(); it != theList.end(); it++)
cout << *it << endl;
return 0;
}
This version works
Kurt
ZuK at 2007-11-9 0:49:31 >

# 5 Re: std::transform doesnt always work
1) Why use "_toupper" instead of "toupper"?
2) The predicate should return a bool, not an int, so it should be:
bool comp(const string& s1, const string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),toupper);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),toupper);
return i1.compare(i2) < 0;
}
3) stricmp() is not standard (I don't think anyway) ... here is a standard
way to do it ...
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <functional>
#include <cctype>
using namespace std;
namespace
{
struct case_insensitive_less : public std::binary_function< char,char,bool >
{
bool operator () (char x, char y)
{
return toupper( static_cast< unsigned char >(x)) <
toupper( static_cast< unsigned char >(y));
}
};
bool NoCaseLess(const std::string &a, const std::string &b)
{
return std::lexicographical_compare( a.begin(),a.end(),
b.begin(),b.end(), case_insensitive_less() );
}
}
int main()
{
vector<string> theList;
theList.push_back("alice");
theList.push_back("Hello");
theList.push_back("below");
theList.push_back("hello");
sort(theList.begin(),theList.end(),NoCaseLess);
vector<string>::iterator it;
for(it = theList.begin(); it != theList.end(); it++)
cout << *it << endl;
return 0;
}
# 6 Re: std::transform doesnt always work
"This version works"
It doesn't:
Hello
alice
below
hello
std::sort is supposed to take a binary predicate, like the one indicated in my post.
# 7 Re: std::transform doesnt always work
return i1.compare(i2) < 0;
I understand that this is a binary predicate.
Kurt
ZuK at 2007-11-9 0:52:35 >

# 8 Re: std::transform doesnt always work
"3) stricmp() is not standard (I don't think anyway) ..."
True, stricmp is not in the standard, but it is widely implemented (certainly on the MSVC compiler that stober uses).
Agreed, the standard form is a good (albeit longer) alternative... ;)
# 9 Re: std::transform doesnt always work
"I understand that this is a binary predicate."
A binary predicate is defined as one that takes two input parameters and returns bool (not int).
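To make the int-versus-bool point concrete, here is a small sketch (the function names are made up for illustration). When the result of compare() is returned as-is and then tested as a condition, every non-zero value counts as true, so "less" and "greater" become indistinguishable and the predicate no longer describes an ordering.

```cpp
#include <cassert>
#include <string>

// Mimics returning i1.compare(i2) directly from the predicate:
// any non-zero result is treated as true by the caller.
inline bool compare_used_as_predicate(const std::string& a, const std::string& b)
{
    return a.compare(b) != 0;  // same truth value as the implicit int-to-bool conversion
}

// A proper strict "less than" predicate.
inline bool compare_less(const std::string& a, const std::string& b)
{
    return a.compare(b) < 0;
}

inline void check_predicates()
{
    // Broken: claims "alice < below" AND "below < alice" at the same time.
    assert(compare_used_as_predicate("alice", "below"));
    assert(compare_used_as_predicate("below", "alice"));

    // Correct: true in exactly one direction, false for equal strings.
    assert(compare_less("alice", "below"));
    assert(!compare_less("below", "alice"));
    assert(!compare_less("alice", "alice"));
}
```

std::sort needs the second kind of predicate; with the first kind the result is unspecified and may differ from one library implementation to another.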
# 10 Re: std::transform doesnt always work
Thanks, everyone. I really like the algorithm in post #6 -- it seems to work better with both printable and non-printable characters.
# 11 Re: std::transform doesnt always work
"I really like the algorithm in post #6 -- seems to work better with both printable and non-printable characters."
I liked it very much as well... And would rather address it as Philip's post, and not "post #6"... :rolleyes:
;)
# 12 Re: std::transform doesnt always work
"A binary predicate is defined as one that returns bool (not int)."
Agreed. But at least my compiler knows how to convert the expression
return i1.compare(i2) < 0;
into a bool. ( so the example does work )
K
ZuK at 2007-11-9 0:57:46 >

# 13 Re: std::transform doesnt always work
"But at least my compiler knows how to convert the expression ... into a bool. ( so the example does work )"
Interesting.
Mine did not (MSVC 6.0)... :D
# 14 Re: std::transform doesnt always work
Mine did not (MSVC 6.0)... :D
I originally tested the example with g++. It did work.
Then I tried with MSVC 6.0 -> doesn't work. You are right.
Finally I replaced _toupper with toupper. This version again works with MSVC 6.0.
Really interesting.
Kurt
ZuK at 2007-11-9 0:59:43 >

# 15 Re: std::transform doesnt always work
"Finally I replaced _toupper with toupper. This version again works with MSVC 6.0."
True - toupper is standard compliant.
So, with this tiny modification your predicate is absolutely OK (hopefully on your compiler as well... ? Let me know... )
// Sort lexicographically...
bool comp(const string& s1,const string& s2)
{
string i1 = s1;
transform(i1.begin(),i1.end(),i1.begin(),toupper);
string i2 = s2;
transform(i2.begin(),i2.end(),i2.begin(),toupper);
return i1.compare(i2) < 0;
}
# 16 Re: std::transform doesnt always work
There is also collate in locale. That should probably be the one you use for case-insensitive string comparisons. It can handle letters with accents too (if the locale is one that supports those characters).
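A minimal sketch of what using the collate facet could look like; collate_compare and sort_with_locale are made-up helper names, and the default of std::locale::classic() is only chosen so the example behaves the same everywhere. Note that in the classic "C" locale this is a plain character-by-character comparison, so it is still case-sensitive there; whether a named locale sorts "a" next to "A" depends on the platform's locale data.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: three-way compare using a locale's collate facet.
inline int collate_compare(const std::string& a, const std::string& b,
                           const std::locale& loc = std::locale::classic())
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}

// std::locale::operator() compares two strings through the same facet,
// so a locale object can be handed to std::sort as the predicate.
inline void sort_with_locale(std::vector<std::string>& v, const std::locale& loc)
{
    std::sort(v.begin(), v.end(), loc);
}

inline void check_collate()
{
    // Classic locale: 'H' has a smaller code than 'a', so "Hello" sorts first.
    assert(collate_compare("Hello", "alice") < 0);
    assert(collate_compare("alice", "alice") == 0);
}
```

Passing a different std::locale (for example one constructed from a platform-specific name) changes the ordering without touching the sorting code. Older compilers such as VC++ 6.0 may need a different use_facet spelling.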
# 17 Re: std::transform doesnt always work
"True, stricmp is not in the standard, but it is widely implemented (certainly on the MSVC compiler that stober uses).
Agreed, the standard form is a good (albeit longer) alternative... ;)"
Which is why one puts the definition into a function that can be reused throughout the project in a very simple way:
if( NoCaseLess("test", "TEST") )
{
// ...
}
If I were to do anything differently, it would be to add the possibility of passing a locale to the case-insensitive predicate:
(Philip's code, modified so that the predicate stores and uses a locale)
#include <cstddef>
#include <cstdlib>
#include <algorithm>
#include <functional>
#include <iostream>
#include <locale>
#include <string>
#include <vector>
namespace
{
struct case_insensitive_less
: public std::binary_function< char,char,bool >
{
case_insensitive_less()
: m_locale() // copy of the current global locale (std::locale::empty() is a non-standard MSVC extension)
{
}
case_insensitive_less(std::locale const & l)
: m_locale(l)
{
}
bool operator () (char x, char y)
{
return
std::toupper(x, m_locale) <
std::toupper(y, m_locale);
}
private:
std::locale const m_locale;
};
bool NoCaseLess(
const std::string &a,
const std::string &b)
{
return std::lexicographical_compare(
a.begin(),
a.end(),
b.begin(),
b.end(),
case_insensitive_less() );
}
bool NoCaseLessGermanLocale(
const std::string &a,
const std::string &b)
{
return std::lexicographical_compare(
a.begin(),
a.end(),
b.begin(),
b.end(),
case_insensitive_less(std::locale("German_germany")) );
}
}
int main()
{
std::vector<std::string> theList;
theList.push_back("alice");
theList.push_back("Hello");
theList.push_back("below");
theList.push_back("hello");
std::sort(theList.begin(),theList.end(),NoCaseLessGermanLocale);
std::vector<std::string>::iterator it;
for(it = theList.begin(); it != theList.end(); it++)
{
std::cout << *it << std::endl;
}
return EXIT_SUCCESS;
}
Now it sorts using the German locale (it doesn't produce any difference in this case, but it could have).
Hope this helps
# 18 Re: std::transform doesnt always work
Good point.
Here is a link to an article by Matt Austern (I think it might appear
as an appendix in Meyers' "Effective STL" book):
http://lafstern.org/matt/col2_new.pdf
# 19 Re: std::transform doesnt always work
Making comp return bool doesn't make any difference using MSVC6.0.
It works using toupper and fails using _toupper.
Using g++ 3.3.3 (cygwin special) the version using toupper fails to compile. Must have something to do with the #includes. Right now I don't have the time for further investigation. I'll continue in the evening.
Kurt
ZuK at 2007-11-9 1:04:50 >
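A likely cause of the g++ error (this is an assumption, since the error message isn't quoted here): once <locale> is pulled in, directly or through <iostream>, toupper names both the one-argument <cctype> function and the two-argument template from <locale>, so transform cannot deduce which one is meant. A small wrapper with a single signature sidesteps that. The names upper_char, to_upper_copy and comp_upper below are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical wrapper: one non-overloaded function that std::transform can deduce.
inline char upper_char(char c)
{
    // The cast avoids passing a negative value to toupper when char is signed.
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Returns an upper-cased copy of s.
inline std::string to_upper_copy(const std::string& s)
{
    std::string result(s);
    std::transform(result.begin(), result.end(), result.begin(), upper_char);
    return result;
}

// Case-insensitive "less than" predicate for std::sort.
inline bool comp_upper(const std::string& s1, const std::string& s2)
{
    return to_upper_copy(s1) < to_upper_copy(s2);
}

inline void check_upper()
{
    assert(to_upper_copy("Hello") == "HELLO");
    assert(comp_upper("alice", "Hello"));
    assert(!comp_upper("Hello", "hello"));
}
```

This copies both strings on every comparison, just like the original comp; the lexicographical_compare version from post #5 avoids the copies.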

# 20 Re: std::transform doesnt always work
Making comp return bool doesn't make any difference using MSVC6.0.
That's true if you just change the return value from int to bool. The comparison function screws sort() up if the comparison function returns anything other than true or false, which was one of the reasons the code in my original post didn't work -- the other reason was the use of _toupper() instead of toupper().
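On the _toupper part: as far as I know, MSVC documents _toupper as giving a defined result only when its argument is a lowercase ASCII letter, and its headers implement it as a bare offset calculation with no check. The sketch below does not use the real macro; unchecked_upper is a made-up stand-in that mimics that behaviour (assuming ASCII), to show how an already-uppercase 'H' would turn into '('.

```cpp
#include <cassert>

// Hypothetical stand-in for an unchecked conversion: blindly applies the
// lowercase-to-uppercase offset, whatever the input is. Assumes ASCII.
inline char unchecked_upper(char c)
{
    return static_cast<char>(c - 'a' + 'A');
}

// Checked variant: only lowercase ASCII letters are converted.
inline char checked_upper(char c)
{
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

inline void check_upper_variants()
{
    assert(unchecked_upper('h') == 'H');  // fine for a lowercase letter
    assert(unchecked_upper('H') == '(');  // 72 - 97 + 65 = 40, which is '('
    assert(checked_upper('h') == 'H');
    assert(checked_upper('H') == 'H');    // already uppercase: left alone
}
```

That would account for "Hello" turning into "(ELLO" while the all-lowercase strings come out right. The standard toupper does the check itself, which fits with the fix of dropping the underscore.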
# 21 Re: std::transform doesnt always work
You can try the German example using some letters with umlaut characters. If anyone German posts here: do those get placed alphabetically at the end, or after their non-accented version?