What is the most effective way of iterating a std::vector and why?





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







18















In terms of space-time complexity which of the following is best way to iterate over a std::vector and why?



Way 1:



for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}


Way 2:



for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 3:



for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 4:



for(auto const& value: a) {
/* std::cout << value; ... */









share|improve this question




















  • 2





    Possible duplicate of What are the complexity guarantees of the standard containers?

    – Michael Chourdakis
    23 hours ago






  • 15





    They are equivalent. Sometimes the compiler may even turn one into the other.

    – Marc Glisse
    23 hours ago








  • 7





    Way 5: for(auto&& value: a) { is the best.

    – Bathsheba
    22 hours ago








  • 6





    They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

    – Marco13
    17 hours ago






  • 2





    You are missing an alternative: NOT re-computing v.size() every single iteration...

    – Matthieu M.
    16 hours ago


















18















In terms of space-time complexity which of the following is best way to iterate over a std::vector and why?



Way 1:



for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}


Way 2:



for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 3:



for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 4:



for(auto const& value: a) {
/* std::cout << value; ... */









share|improve this question




















  • 2





    Possible duplicate of What are the complexity guarantees of the standard containers?

    – Michael Chourdakis
    23 hours ago






  • 15





    They are equivalent. Sometimes the compiler may even turn one into the other.

    – Marc Glisse
    23 hours ago








  • 7





    Way 5: for(auto&& value: a) { is the best.

    – Bathsheba
    22 hours ago








  • 6





    They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

    – Marco13
    17 hours ago






  • 2





    You are missing an alternative: NOT re-computing v.size() every single iteration...

    – Matthieu M.
    16 hours ago














18












18








18


3






In terms of space-time complexity which of the following is best way to iterate over a std::vector and why?



Way 1:



for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}


Way 2:



for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 3:



for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 4:



for(auto const& value: a) {
/* std::cout << value; ... */









share|improve this question
















In terms of space-time complexity which of the following is best way to iterate over a std::vector and why?



Way 1:



for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}


Way 2:



for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 3:



for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}


Way 4:



for(auto const& value: a) {
/* std::cout << value; ... */






c++ performance stl iterator






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 22 hours ago









mch

5,44821733




5,44821733










asked 23 hours ago









getsuhagetsuha

1096




1096








  • 2





    Possible duplicate of What are the complexity guarantees of the standard containers?

    – Michael Chourdakis
    23 hours ago






  • 15





    They are equivalent. Sometimes the compiler may even turn one into the other.

    – Marc Glisse
    23 hours ago








  • 7





    Way 5: for(auto&& value: a) { is the best.

    – Bathsheba
    22 hours ago








  • 6





    They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

    – Marco13
    17 hours ago






  • 2





    You are missing an alternative: NOT re-computing v.size() every single iteration...

    – Matthieu M.
    16 hours ago














  • 2





    Possible duplicate of What are the complexity guarantees of the standard containers?

    – Michael Chourdakis
    23 hours ago






  • 15





    They are equivalent. Sometimes the compiler may even turn one into the other.

    – Marc Glisse
    23 hours ago








  • 7





    Way 5: for(auto&& value: a) { is the best.

    – Bathsheba
    22 hours ago








  • 6





    They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

    – Marco13
    17 hours ago






  • 2





    You are missing an alternative: NOT re-computing v.size() every single iteration...

    – Matthieu M.
    16 hours ago








2




2





Possible duplicate of What are the complexity guarantees of the standard containers?

– Michael Chourdakis
23 hours ago





Possible duplicate of What are the complexity guarantees of the standard containers?

– Michael Chourdakis
23 hours ago




15




15





They are equivalent. Sometimes the compiler may even turn one into the other.

– Marc Glisse
23 hours ago







They are equivalent. Sometimes the compiler may even turn one into the other.

– Marc Glisse
23 hours ago






7




7





Way 5: for(auto&& value: a) { is the best.

– Bathsheba
22 hours ago







Way 5: for(auto&& value: a) { is the best.

– Bathsheba
22 hours ago






6




6





They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

– Marco13
17 hours ago





They are equally effective. If you want to know which one is more efficient, consider changing the title of the question.

– Marco13
17 hours ago




2




2





You are missing an alternative: NOT re-computing v.size() every single iteration...

– Matthieu M.
16 hours ago





You are missing an alternative: NOT re-computing v.size() every single iteration...

– Matthieu M.
16 hours ago












8 Answers
8






active

oldest

votes


















33














First of all, Way 2 and Way 3 are identical in practically all standard library implementations.



Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.



If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to



{
auto && __range = range_expression ;
auto __begin = begin_expr ;
auto __end = end_expr ;
for ( ; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}


The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.



for (auto value: a) { /* ... */ }


this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.



Note that with the parallel algorithm facilities in C++17, you can also try out



#include <algorithm>
#include <execution>

std::for_each(std::par_unseq, a.cbegin(), a.cend(),
(const auto& e) { /* do stuff... */ });


but whether this is faster than an ordinary loop depends on may circumstantial details.






share|improve this answer


























  • Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

    – bremen_matt
    9 hours ago








  • 2





    @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

    – opa
    8 hours ago











  • The parallel versions? That is, std::par_unseq

    – bremen_matt
    19 mins ago











  • I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

    – bremen_matt
    11 mins ago











  • One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

    – bremen_matt
    9 mins ago



















9














Addition to lubgr's answer:



Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.



Indices can be useful if you have steps greater than 1 (whyever you would need to...):



for(size_t i = 0; i < v.size(); i += 2) { ... }


While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...






share|improve this answer


























  • To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

    – YSC
    21 hours ago





















6














Prefer iterators over indices/keys.



While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.



1As long as you use instead of .at() for accesssing by index, of course.





Memorize the end-bound.



Recomputing the end-bound at each iteration is inefficient for two reasons:




  • In general: a local variable is not aliased, which is more optimizer-friendly.

  • On containers other than vector: computing the end/size could be a bit more expensive.


You can do so as a one-liner:



for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }


(This is an exception to the general prohibition on declaring a single variable at a time.)





Use the for-each loop form.



The for-each loop form will automatically:




  • Use iterators.

  • Memorize the end-bound.


Thus:



for (/*...*/ value : vec) { ... }




Take built-in types by values, other types by reference.



There is a non-obvious trade-off between taking an element by value and taking an element by reference:




  • Taking an element by reference avoids a copy, which can be an expensive operation.

  • Taking an element by value is more optimizer-friendly1.


At the extremes, the choice should be obvious:




  • Built-in types (int, std::int64_t, void*, ...) should be taken by value.

  • Potentially allocating types (std::string, ...) should be taken by reference.


In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.



Thus, the general form is:



for (auto& element : vec) { ... }


And if you are dealing with a built-in:



for (int element : vec) { ... }


1This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.






share|improve this answer





















  • 1





    Out of interest, why didn't you consider for (auto&& element : vec)?

    – Bathsheba
    16 hours ago






  • 2





    @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

    – Matthieu M.
    15 hours ago



















1














The lazy answer: The complexities are equivalent.




  • The time complexity of all solutions is Θ(n).

  • The space complexity of all solutions is Θ(1).


The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.



It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.



Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.






share|improve this answer































    1














    For completeness, I wanted to mention that your loop might want to change the size of the vector.



    std::vector<int> v = get_some_data();
    for (std::size_t i=0; i<v.size(); ++i)
    {
    int x = some_function(v[i]);
    if(x) v.push_back(x);
    }


    In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.



    If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.



    By the way, I prefer to use while-loops for such cases over for-loops but that's another story.






    share|improve this answer
























    • An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

      – davidbak
      12 hours ago



















    0














    All of the ways you listed have identical time complexity and identical space complexity (no surprise there).



    Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).



    We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p



    On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.






    share|improve this answer
























    • "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

      – Cody Gray
      12 hours ago



















    0














    It depends to a large extent on what you mean by "effective".



    Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.



    From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.



    Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?



    for (auto it = v1.cbegin();  it != v1.end();  ++it)
    {
    std::cout << v1[i];
    }

    for (auto i = 0u; i < v2.size(); ++i)
    {
    std::cout << v1[i];
    }

    for (auto const i: v3)
    {
    std::cout << i;
    }


    Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!





    ¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...






    share|improve this answer































      0














      Prefer method 4, std::for_each (if you really must), or method 5/6:



      void method5(std::vector<float>& v) {
      for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
      *it *= *it;
      }
      }
      void method6(std::vector<float>& v) {
      auto ptr = v.data();
      for(std::size_t i = 0, n = v.size(); i != n; i++) {
      ptr[i] *= ptr[i];
      }
      }


      The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate end() and size() in each iteration. This will prevent all SIMD optimisations.



      You can see proof here:



      https://godbolt.org/z/BchhmU



      You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.



      Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.






      share|improve this answer
























        Your Answer






        StackExchange.ifUsing("editor", function () {
        StackExchange.using("externalEditor", function () {
        StackExchange.using("snippets", function () {
        StackExchange.snippets.init();
        });
        });
        }, "code-snippets");

        StackExchange.ready(function() {
        var channelOptions = {
        tags: "".split(" "),
        id: "1"
        };
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function() {
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled) {
        StackExchange.using("snippets", function() {
        createEditor();
        });
        }
        else {
        createEditor();
        }
        });

        function createEditor() {
        StackExchange.prepareEditor({
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: true,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: 10,
        bindNavPrevention: true,
        postfix: "",
        imageUploader: {
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        },
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        });


        }
        });














        draft saved

        draft discarded


















        StackExchange.ready(
        function () {
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55606153%2fwhat-is-the-most-effective-way-of-iterating-a-stdvector-and-why%23new-answer', 'question_page');
        }
        );

        Post as a guest















        Required, but never shown

























        8 Answers
        8






        active

        oldest

        votes








        8 Answers
        8






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        33














        First of all, Way 2 and Way 3 are identical in practically all standard library implementations.



        Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.



        If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to



        {
        auto && __range = range_expression ;
        auto __begin = begin_expr ;
        auto __end = end_expr ;
        for ( ; __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
        }
        }


        The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.



        for (auto value: a) { /* ... */ }


        this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.



        Note that with the parallel algorithm facilities in C++17, you can also try out



        #include <algorithm>
        #include <execution>

        std::for_each(std::par_unseq, a.cbegin(), a.cend(),
        (const auto& e) { /* do stuff... */ });


        but whether this is faster than an ordinary loop depends on may circumstantial details.






        share|improve this answer


























        • Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

          – bremen_matt
          9 hours ago








        • 2





          @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

          – opa
          8 hours ago











        • The parallel versions? That is, std::par_unseq

          – bremen_matt
          19 mins ago











        • I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

          – bremen_matt
          11 mins ago











        • One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

          – bremen_matt
          9 mins ago
















        33














        First of all, Way 2 and Way 3 are identical in practically all standard library implementations.



        Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.



        If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to



        {
        auto && __range = range_expression ;
        auto __begin = begin_expr ;
        auto __end = end_expr ;
        for ( ; __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
        }
        }


        The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.



        for (auto value: a) { /* ... */ }


        this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.



        Note that with the parallel algorithm facilities in C++17, you can also try out



        #include <algorithm>
        #include <execution>

        std::for_each(std::par_unseq, a.cbegin(), a.cend(),
        (const auto& e) { /* do stuff... */ });


        but whether this is faster than an ordinary loop depends on may circumstantial details.






        share|improve this answer


























        • Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

          – bremen_matt
          9 hours ago








        • 2





          @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

          – opa
          8 hours ago











        • The parallel versions? That is, std::par_unseq

          – bremen_matt
          19 mins ago











        • I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

          – bremen_matt
          11 mins ago











        • One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

          – bremen_matt
          9 mins ago














        33












        33








        33







        First of all, Way 2 and Way 3 are identical in practically all standard library implementations.



        Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.



        If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to



        {
        auto && __range = range_expression ;
        auto __begin = begin_expr ;
        auto __end = end_expr ;
        for ( ; __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
        }
        }


        The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.



        for (auto value: a) { /* ... */ }


        this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.



        Note that with the parallel algorithm facilities in C++17, you can also try out



        #include <algorithm>
        #include <execution>

        std::for_each(std::par_unseq, a.cbegin(), a.cend(),
        (const auto& e) { /* do stuff... */ });


        but whether this is faster than an ordinary loop depends on may circumstantial details.






        share|improve this answer















        First of all, Way 2 and Way 3 are identical in practically all standard library implementations.



        Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.



        If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to



        {
        auto && __range = range_expression ;
        auto __begin = begin_expr ;
        auto __end = end_expr ;
        for ( ; __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
        }
        }


        The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.



        for (auto value: a) { /* ... */ }


        this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.



        Note that with the parallel algorithm facilities in C++17, you can also try out



        #include <algorithm>
        #include <execution>

        std::for_each(std::par_unseq, a.cbegin(), a.cend(),
        (const auto& e) { /* do stuff... */ });


        but whether this is faster than an ordinary loop depends on may circumstantial details.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 22 hours ago

























        answered 22 hours ago









        lubgrlubgr

        15.3k32353




        15.3k32353













        • Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

          – bremen_matt
          9 hours ago








        • 2





          @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

          – opa
          8 hours ago











        • The parallel versions? That is, std::par_unseq

          – bremen_matt
          19 mins ago











        • I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

          – bremen_matt
          11 mins ago











        • One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

          – bremen_matt
          9 mins ago



















        • Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

          – bremen_matt
          9 hours ago








        • 2





          @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

          – opa
          8 hours ago











        • The parallel versions? That is, std::par_unseq

          – bremen_matt
          19 mins ago











        • I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

          – bremen_matt
          11 mins ago











        • One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

          – bremen_matt
          9 mins ago

















        Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

        – bremen_matt
        9 hours ago







        Good luck finding those C++17 features in a compiler. Those features are virtually non existant. I wish the compilers would just implement the Intel STL version and be done with it already

        – bremen_matt
        9 hours ago






        2




        2





        @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

        – opa
        8 hours ago





        @bremen_matt AFAIK, GCC is really the odd man out here, and both clang and MSVC implement this already?

        – opa
        8 hours ago













        The parallel versions? That is, std::par_unseq

        – bremen_matt
        19 mins ago





        The parallel versions? That is, std::par_unseq

        – bremen_matt
        19 mins ago













        I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

        – bremen_matt
        11 mins ago





        I don't know about that... I can't get a simple code to compile with either GCC head or Clang head: wandbox.org/permlink/cnCxFscI0SWV1UF0

        – bremen_matt
        11 mins ago













        One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

        – bremen_matt
        9 mins ago





        One place I found tracking the progress is: en.cppreference.com/w/cpp/compiler_support

        – bremen_matt
        9 mins ago













        9














        Addition to lubgr's answer:



        Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.



        Indices can be useful if you have steps greater than 1 (whyever you would need to...):



        for(size_t i = 0; i < v.size(); i += 2) { ... }


        While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...






        share|improve this answer


























        • To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

          – YSC
          21 hours ago


















        9














        Addition to lubgr's answer:



        Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.



        Indices can be useful if you have steps greater than 1 (whyever you would need to...):



        for(size_t i = 0; i < v.size(); i += 2) { ... }


        While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...






        share|improve this answer


























        • To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

          – YSC
          21 hours ago
















        9












        9








        9







        Addition to lubgr's answer:



        Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.



        Indices can be useful if you have steps greater than 1 (whyever you would need to...):



        for(size_t i = 0; i < v.size(); i += 2) { ... }


        While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...






        share|improve this answer















        Addition to lubgr's answer:



        Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.



        Indices can be useful if you have steps greater than 1 (whyever you would need to...):



        for(size_t i = 0; i < v.size(); i += 2) { ... }


        While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 15 hours ago









        Toby Speight

        17.5k134469




        17.5k134469










        answered 22 hours ago









        AconcaguaAconcagua

        13.5k32445




        13.5k32445













        • To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

          – YSC
          21 hours ago





















        • To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

          – YSC
          21 hours ago



















        To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

        – YSC
        21 hours ago







        To add on why this should not be considered: either the compiler cannot optimize this and there is no point to use the tools provided by the Standard Library (<vector>, <algorithm>, etc.) if you expect a minimum of performance; either it does (and it does and losing time and effort on this minor difference never matters unless you pinpoint through a profiler a bottleneck in your implementation.

        – YSC
        21 hours ago













        6














        Prefer iterators over indices/keys.



        While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.



        1As long as you use instead of .at() for accesssing by index, of course.





        Memorize the end-bound.



        Recomputing the end-bound at each iteration is inefficient for two reasons:




        • In general: a local variable is not aliased, which is more optimizer-friendly.

        • On containers other than vector: computing the end/size could be a bit more expensive.


        You can do so as a one-liner:



        for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }


        (This is an exception to the general prohibition on declaring a single variable at a time.)





        Use the for-each loop form.



        The for-each loop form will automatically:




        • Use iterators.

        • Memorize the end-bound.


        Thus:



        for (/*...*/ value : vec) { ... }




        Take built-in types by values, other types by reference.



        There is a non-obvious trade-off between taking an element by value and taking an element by reference:




        • Taking an element by reference avoids a copy, which can be an expensive operation.

        • Taking an element by value is more optimizer-friendly1.


        At the extremes, the choice should be obvious:




        • Built-in types (int, std::int64_t, void*, ...) should be taken by value.

        • Potentially allocating types (std::string, ...) should be taken by reference.


        In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.



        Thus, the general form is:



        for (auto& element : vec) { ... }


        And if you are dealing with a built-in:



        for (int element : vec) { ... }


        1This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.






        share|improve this answer





















        • 1





          Out of interest, why didn't you consider for (auto&& element : vec)?

          – Bathsheba
          16 hours ago






        • 2





          @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

          – Matthieu M.
          15 hours ago
















        6














        Prefer iterators over indices/keys.



        While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.



        1As long as you use instead of .at() for accesssing by index, of course.





        Memorize the end-bound.



        Recomputing the end-bound at each iteration is inefficient for two reasons:




        • In general: a local variable is not aliased, which is more optimizer-friendly.

        • On containers other than vector: computing the end/size could be a bit more expensive.


        You can do so as a one-liner:



        for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }


        (This is an exception to the general prohibition on declaring a single variable at a time.)





        Use the for-each loop form.



        The for-each loop form will automatically:




        • Use iterators.

        • Memorize the end-bound.


        Thus:



        for (/*...*/ value : vec) { ... }




        Take built-in types by values, other types by reference.



        There is a non-obvious trade-off between taking an element by value and taking an element by reference:




        • Taking an element by reference avoids a copy, which can be an expensive operation.

        • Taking an element by value is more optimizer-friendly1.


        At the extremes, the choice should be obvious:




        • Built-in types (int, std::int64_t, void*, ...) should be taken by value.

        • Potentially allocating types (std::string, ...) should be taken by reference.


        In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.



        Thus, the general form is:



        for (auto& element : vec) { ... }


        And if you are dealing with a built-in:



        for (int element : vec) { ... }


        1This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.






        share|improve this answer





















        • 1





          Out of interest, why didn't you consider for (auto&& element : vec)?

          – Bathsheba
          16 hours ago






        • 2





          @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

          – Matthieu M.
          15 hours ago














        6












        6








        6







        Prefer iterators over indices/keys.



        While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.



        1As long as you use instead of .at() for accesssing by index, of course.





        Memorize the end-bound.



        Recomputing the end-bound at each iteration is inefficient for two reasons:




        • In general: a local variable is not aliased, which is more optimizer-friendly.

        • On containers other than vector: computing the end/size could be a bit more expensive.


        You can do so as a one-liner:



        for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }


        (This is an exception to the general prohibition on declaring a single variable at a time.)





        Use the for-each loop form.



        The for-each loop form will automatically:




        • Use iterators.

        • Memorize the end-bound.


        Thus:



        for (/*...*/ value : vec) { ... }




        Take built-in types by values, other types by reference.



        There is a non-obvious trade-off between taking an element by value and taking an element by reference:




        • Taking an element by reference avoids a copy, which can be an expensive operation.

        • Taking an element by value is more optimizer-friendly1.


        At the extremes, the choice should be obvious:




        • Built-in types (int, std::int64_t, void*, ...) should be taken by value.

        • Potentially allocating types (std::string, ...) should be taken by reference.


        In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.



        Thus, the general form is:



        for (auto& element : vec) { ... }


        And if you are dealing with a built-in:



        for (int element : vec) { ... }


        1This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.






        share|improve this answer















        Prefer iterators over indices/keys.



        While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.



        1As long as you use instead of .at() for accesssing by index, of course.





        Memorize the end-bound.



        Recomputing the end-bound at each iteration is inefficient for two reasons:




        • In general: a local variable is not aliased, which is more optimizer-friendly.

        • On containers other than vector: computing the end/size could be a bit more expensive.


        You can do so as a one-liner:



        for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }


        (This is an exception to the general prohibition on declaring a single variable at a time.)





        Use the for-each loop form.



        The for-each loop form will automatically:




        • Use iterators.

        • Memorize the end-bound.


        Thus:



        for (/*...*/ value : vec) { ... }




        Take built-in types by values, other types by reference.



        There is a non-obvious trade-off between taking an element by value and taking an element by reference:




        • Taking an element by reference avoids a copy, which can be an expensive operation.

        • Taking an element by value is more optimizer-friendly1.


        At the extremes, the choice should be obvious:




        • Built-in types (int, std::int64_t, void*, ...) should be taken by value.

        • Potentially allocating types (std::string, ...) should be taken by reference.


        In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.



        Thus, the general form is:



        for (auto& element : vec) { ... }


        And if you are dealing with a built-in:



        for (int element : vec) { ... }


        1This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 12 hours ago









        Cody Gray

        195k35384475




        195k35384475










        answered 16 hours ago









        Matthieu M.Matthieu M.

        206k29284521




        206k29284521








        • 1





          Out of interest, why didn't you consider for (auto&& element : vec)?

          – Bathsheba
          16 hours ago






        • 2





          @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

          – Matthieu M.
          15 hours ago














        • 1





          Out of interest, why didn't you consider for (auto&& element : vec)?

          – Bathsheba
          16 hours ago






        • 2





          @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

          – Matthieu M.
          15 hours ago








        1




        1





        Out of interest, why didn't you consider for (auto&& element : vec)?

        – Bathsheba
        16 hours ago





        Out of interest, why didn't you consider for (auto&& element : vec)?

        – Bathsheba
        16 hours ago




        2




        2





        @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

        – Matthieu M.
        15 hours ago





        @Bathsheba: I find it confusing. We know that it will be a reference, so why mark it as a universal reference?

        – Matthieu M.
        15 hours ago











        1














        The lazy answer: The complexities are equivalent.




        • The time complexity of all solutions is Θ(n).

        • The space complexity of all solutions is Θ(1).


        The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.



        It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.



        Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.






        share|improve this answer




























          1














          The lazy answer: The complexities are equivalent.




          • The time complexity of all solutions is Θ(n).

          • The space complexity of all solutions is Θ(1).


          The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.



          It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.



          Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.






          share|improve this answer


























            1












            1








            1







            The lazy answer: The complexities are equivalent.




            • The time complexity of all solutions is Θ(n).

            • The space complexity of all solutions is Θ(1).


            The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.



            It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.



            Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.






            share|improve this answer













            The lazy answer: The complexities are equivalent.




            • The time complexity of all solutions is Θ(n).

            • The space complexity of all solutions is Θ(1).


            The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.



            It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.



            Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered 20 hours ago









            Arne VogelArne Vogel

            5,00521326




            5,00521326























                1














                For completeness, I wanted to mention that your loop might want to change the size of the vector.



                std::vector<int> v = get_some_data();
                for (std::size_t i=0; i<v.size(); ++i)
                {
                int x = some_function(v[i]);
                if(x) v.push_back(x);
                }


                In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.



                If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.



                By the way, I prefer to use while-loops for such cases over for-loops but that's another story.






                share|improve this answer
























                • An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                  – davidbak
                  12 hours ago
















                1














                For completeness, I wanted to mention that your loop might want to change the size of the vector.



                std::vector<int> v = get_some_data();
                for (std::size_t i=0; i<v.size(); ++i)
                {
                int x = some_function(v[i]);
                if(x) v.push_back(x);
                }


                In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.



                If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.



                By the way, I prefer to use while-loops for such cases over for-loops but that's another story.






                share|improve this answer
























                • An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                  – davidbak
                  12 hours ago














                1












                1








                1







                For completeness, I wanted to mention that your loop might want to change the size of the vector.



                std::vector<int> v = get_some_data();
                for (std::size_t i=0; i<v.size(); ++i)
                {
                int x = some_function(v[i]);
                if(x) v.push_back(x);
                }


                In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.



                If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.



                By the way, I prefer to use while-loops for such cases over for-loops but that's another story.






                share|improve this answer













                For completeness, I wanted to mention that your loop might want to change the size of the vector.



                std::vector<int> v = get_some_data();
                for (std::size_t i=0; i<v.size(); ++i)
                {
                int x = some_function(v[i]);
                if(x) v.push_back(x);
                }


                In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.



                If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.



                By the way, I prefer to use while-loops for such cases over for-loops but that's another story.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 18 hours ago









                Handy999Handy999

                69018




                69018













                • An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                  – davidbak
                  12 hours ago



















                • An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                  – davidbak
                  12 hours ago

















                An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                – davidbak
                12 hours ago





                An important supplemental answer because the OP's question specifically called out iterating over std::vector - as opposed to iterating over an arbitrary STL-ish container.

                – davidbak
                12 hours ago











                0














                All of the ways you listed have identical time complexity and identical space complexity (no surprise there).



                Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).



                We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p



                On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.






                share|improve this answer
























                • "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                  – Cody Gray
                  12 hours ago
















                0














                All of the ways you listed have identical time complexity and identical space complexity (no surprise there).



                Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).



                We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p



                On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.






                share|improve this answer
























                • "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                  – Cody Gray
                  12 hours ago














                0












                0








                0







                All of the ways you listed have identical time complexity and identical space complexity (no surprise there).



                Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).



                We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p



                On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.






                share|improve this answer













                All of the ways you listed have identical time complexity and identical space complexity (no surprise there).



                Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).



                We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p



                On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 22 hours ago









                Jorge PerezJorge Perez

                2,071719




                2,071719













                • "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                  – Cody Gray
                  12 hours ago



















                • "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                  – Cody Gray
                  12 hours ago

















                "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                – Cody Gray
                12 hours ago





                "On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2." It does that because the backend code generator knows that they are equivalent (in other words, they produce exactly the same machine code).

                – Cody Gray
                12 hours ago











                0














                It depends to a large extent on what you mean by "effective".



                Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.



                From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.



                Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?



                for (auto it = v1.cbegin();  it != v1.end();  ++it)
                {
                std::cout << v1[i];
                }

                for (auto i = 0u; i < v2.size(); ++i)
                {
                std::cout << v1[i];
                }

                for (auto const i: v3)
                {
                std::cout << i;
                }


                Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!





                ¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...






                share|improve this answer




























                  0














                  It depends to a large extent on what you mean by "effective".



                  Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.



                  From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.



                  Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?



                  for (auto it = v1.cbegin();  it != v1.end();  ++it)
                  {
                  std::cout << v1[i];
                  }

                  for (auto i = 0u; i < v2.size(); ++i)
                  {
                  std::cout << v1[i];
                  }

                  for (auto const i: v3)
                  {
                  std::cout << i;
                  }


                  Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!





                  ¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...






                  share|improve this answer


























                    0












                    0








                    0







                    It depends to a large extent on what you mean by "effective".



                    Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.



                    From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.



                    Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?



                    for (auto it = v1.cbegin();  it != v1.end();  ++it)
                    {
                    std::cout << v1[i];
                    }

                    for (auto i = 0u; i < v2.size(); ++i)
                    {
                    std::cout << v1[i];
                    }

                    for (auto const i: v3)
                    {
                    std::cout << i;
                    }


                    Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!





                    ¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...






                    share|improve this answer













                    It depends to a large extent on what you mean by "effective".



                    Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.



                    From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.



                    Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?



                    for (auto it = v1.cbegin();  it != v1.end();  ++it)
                    {
                    std::cout << v1[i];
                    }

                    for (auto i = 0u; i < v2.size(); ++i)
                    {
                    std::cout << v1[i];
                    }

                    for (auto const i: v3)
                    {
                    std::cout << i;
                    }


                    Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!





                    ¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered 14 hours ago









                    Toby SpeightToby Speight

                    17.5k134469




                    17.5k134469























                        0














                        Prefer method 4, std::for_each (if you really must), or method 5/6:



                        void method5(std::vector<float>& v) {
                        for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
                        *it *= *it;
                        }
                        }
                        void method6(std::vector<float>& v) {
                        auto ptr = v.data();
                        for(std::size_t i = 0, n = v.size(); i != n; i++) {
                        ptr[i] *= ptr[i];
                        }
                        }


                        The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate end() and size() in each iteration. This will prevent all SIMD optimisations.



                        You can see proof here:



                        https://godbolt.org/z/BchhmU



                        You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.



                        Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.






                        share|improve this answer




























                          0














                          Prefer method 4, std::for_each (if you really must), or method 5/6:



                          void method5(std::vector<float>& v) {
                          for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
                          *it *= *it;
                          }
                          }
                          void method6(std::vector<float>& v) {
                          auto ptr = v.data();
                          for(std::size_t i = 0, n = v.size(); i != n; i++) {
                          ptr[i] *= ptr[i];
                          }
                          }


                          The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate end() and size() in each iteration. This will prevent all SIMD optimisations.



                          You can see proof here:



                          https://godbolt.org/z/BchhmU



                          You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.



                          Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.






                          share|improve this answer


























                            0












                            0








                            0







                            Prefer method 4, std::for_each (if you really must), or method 5/6:



                            void method5(std::vector<float>& v) {
                            for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
                            *it *= *it;
                            }
                            }
                            void method6(std::vector<float>& v) {
                            auto ptr = v.data();
                            for(std::size_t i = 0, n = v.size(); i != n; i++) {
                            ptr[i] *= ptr[i];
                            }
                            }


                            The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate end() and size() in each iteration. This will prevent all SIMD optimisations.



                            You can see proof here:



                            https://godbolt.org/z/BchhmU



                            You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.



                            Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.






                            share|improve this answer













                            Prefer method 4, std::for_each (if you really must), or method 5/6:



                            void method5(std::vector<float>& v) {
                            for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
                            *it *= *it;
                            }
                            }
                            void method6(std::vector<float>& v) {
                            auto ptr = v.data();
                            for(std::size_t i = 0, n = v.size(); i != n; i++) {
                            ptr[i] *= ptr[i];
                            }
                            }


                            The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate end() and size() in each iteration. This will prevent all SIMD optimisations.



                            You can see proof here:



                            https://godbolt.org/z/BchhmU



                            You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.



                            Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered 4 hours ago









                            robtheblokerobthebloke

                            29114




                            29114






























                                draft saved

                                draft discarded




















































                                Thanks for contributing an answer to Stack Overflow!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid



                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.


                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function () {
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55606153%2fwhat-is-the-most-effective-way-of-iterating-a-stdvector-and-why%23new-answer', 'question_page');
                                }
                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown







                                Popular posts from this blog

                                Knooppunt Holsloot

                                Altaar (religie)

                                Gregoriusmis