Programming guide for classic C/C++

UNDER CONSTRUCTION

Quiz: What does this program print?

#include <iostream>

int main() {
    std::cout << "Do you know C++ ??)";
    return 0;
}

Introduction

This note is for people who are more or less familiar with C and/or C++. If you have read the fundamental books and have relevant experience, everything below will be familiar to you.

On the other hand, I hope that C/C++ will be used not only by people who write production code, but by almost everybody, including researchers and scientists.

Bibliography

[1] The C++ Programming Language, Special Edition, B. Stroustrup

[2] C: A Reference Manual, S. Harbison, G. Steele

Stages of processing a C/C++ program

1. Trigraph replacement. If you are not familiar with this step, you will answer the quiz above incorrectly.

2. Line splicing: a backslash immediately followed by a newline joins two physical lines into one.

3. Preprocessing of the program text. The preprocessor can be built into the compiler, or it can be an independent program.

4. Lexical analysis of the program by splitting the program into tokens.

An important detail is that the compiler always tries to form the longest possible token while reading the text from left to right, even if the result is a program that does not compile.

Tokens can be separated by whitespace. At the compiler level, whitespace includes not only spaces, tabs, and newlines but also comments.

A token belongs to one of the following classes:

4.1. Operators

4.2. Separators

4.3. Identifiers

4.4. Keywords

4.5. Constants

After lexical analysis the program is a sequence of tokens. The entities you work with in the program are referred to by names (identifiers).

Preprocessor

C++98 uses the C89 preprocessor, although the C language itself has gone through several revisions: traditional C, C89, C95, C99, C11

Name Overloading

In C and other programming languages, the same identifier can be associated with more than one object at a time. This situation is called name overloading or name hiding.

C++ takes this concept over from C (see, for example, the C++ 2003 standard, chapter 13).

It is an error to make two declarations of the same name in the same overload class within the same block or at the top level.

An extra rule for name overloading in C++

In C++, structure tags, union tags, and enumeration names are implicitly declared as if by typedef, so they land in the "other names" overload class, the same one that holds ordinary variables.

However, these identifiers (tag names) can be hidden by a subsequent variable or function declaration, or by an enumeration member of the same name in the same scope.

But if you explicitly declare a typedef name for a structure and then declare a variable with the same name in the same scope, the result is an error.

Interestingly, according to ISO/IEC 14882:2003, 3.3.7, a function or variable name takes precedence over a type tag declared in the same scope, regardless of the order of the declarations.

Memory used by types

The representation of an object in memory is a sequence of bits. The value does not have to use all of those bits, but the size of an object is the number of memory units that it occupies.

The storage occupied by one char is taken as the unit of memory. The number of bits in a char is given by the CHAR_BIT macro.

By the rules of C/C++, all objects of the same type occupy the same amount of memory.

Computers fall into two categories according to the order of bytes in a word:

- Right to left, or little-endian: the address of a 32-bit word matches the address of its least significant byte (Intel x86, Pentium)

- Left to right, or big-endian: the address of a 32-bit word matches the address of its most significant byte (Motorola)

Some systems support both modes.

On some computers data can be placed in memory at any address; on others, alignment requirements are imposed on certain types.

The usual way to hold the address of an object is a pointer. To store (or serialize) a pointer value in an integer variable, you can use intptr_t.

The intptr_t and uintptr_t integer types were introduced in C99. Each is wide enough to hold a pointer to any data, but formally not a pointer to a function.

C has a special pointer value called the null pointer, which compares equal to a null pointer constant. A null pointer can be converted to any other pointer type.

About type conversions

First, a word about literal constants. In C/C++ the spelling of a literal can encode the type of the constant, for example through suffixes such as u, L, or f.

We will get to the technical details of type conversion shortly. They can be hard to remember, so it helps to keep the big picture in mind:

The general requirement when converting integer types is the mathematical equivalence of the source and target values.

Prohibited conversions

1. Converting a pointer to a function into a pointer to data, or the reverse, is not allowed in C++.

2. There is no built-in conversion to a struct or union type.

3. C++ treats each enum type as distinct from every other enum type and from the integer types.

4. In both C and C++, implicit conversion from an enumerated type to an integer type is allowed.

5. In C, implicit conversion from an integer to an enumerated type is also allowed, because in C enumerations are integer types; in C++ it is prohibited.

The sequence of type conversions rules in C/C ++

1. Trivial conversion: conversion to the identical type.

2. If the value does not fit when converting to a signed integer type, the conversion has overflowed; the result is implementation-defined (C99 also allows an implementation-defined signal).

3. If the value does not fit when converting to an unsigned type, the result is the unique value congruent to the source value modulo 2^n, where n is the number of bits in the result type. With two's complement, converting between signed and unsigned integers of the same size does not change any bits.

4. If the result type is narrower than the source type and both are unsigned, the conversion can be performed by discarding the appropriate number of most significant bits.

The same rule applies to integer types in two's complement representation.

5. When converting a floating-point value to an integer type, the result equals the source value if possible. A nonzero fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the target type.

6. In C, conversion to floating-point types is possible only from arithmetic types.

When converting from double to float, the result must be one of the two representable values closest to the source value.

Which of the two is chosen is implementation-defined.

7. If a double or integer value being converted to float lies outside the range of the target type, the behavior is undefined.

8. Converting a pointer to a function into a pointer to data, or the reverse, is not allowed in C++.

9. A value of type "array of T" is converted to "pointer to T" by substituting a pointer to the first element of the array.

10. Similarly, a value of type "function ..." is converted to "pointer to function ...".

11. A value of any type can be converted to void.

12. Converting an object pointer to void * and back is guaranteed to restore the original pointer value.

13. In C, void * can be implicitly converted to a pointer to any object type; in C++, an explicit cast is required. (Annex C, 4.10, C++ 2003 standard)

14. The usual unary conversions are applied to the operands of unary operators. Their purpose is to reduce the number of arithmetic types that operators have to deal with.

* An array of type T => pointer to the first element (not applied to the operands of & and sizeof)

* Function => function pointer

* An integer type of rank lower than int => int

* An unsigned integer type of rank lower than int, when int can represent all of its values => int

* An unsigned integer type of rank lower than int, when int cannot represent all of its values => unsigned int

15. On the operands of a binary operator, the usual unary conversions are performed on each operand separately, and then the usual binary conversions are applied.

16. If either operand has type long double, double, or float and the other has a lower rank, the lower-ranked operand is converted to the type of the higher-ranked one.

17. If both operands are unsigned, both are converted to the unsigned type of higher rank.

18. If both operands are signed, both are converted to the signed type of higher rank.

19. An unsigned operand and a signed operand of lower or equal rank => the unsigned type.

20. An unsigned operand and a signed operand of higher rank => the signed type, provided it can represent all values of the unsigned type; otherwise the unsigned counterpart of the signed type.

22. If the call is governed by a prototype that ends in an ellipsis, the usual unary conversions are performed on the arguments that match the ellipsis, and float is always converted to double.

23. If there is no ellipsis and the call is fully governed by a prototype, the conversions from "22" do not occur; each argument is converted to the declared parameter type.

Namespaces

A namespace is a mechanism for expressing logical grouping. If some declarations belong together according to some criterion, they can be placed in the same namespace to express this fact. Advantages of namespaces:

Reflects the logical structure
Avoids name conflicts
Expresses a coherent set of facilities
Keeps users from reaching facilities they do not need
Requires no significant extra effort to use

Disadvantages:

Time spent deciding which namespace each entity belongs to
Various additional nuances, such as:
- A local variable, or a name introduced with a using-declaration, hides variables declared outside the enclosing block
- When libraries that declare many names are made available through using-directives, it is important to understand that conflicts between unused names are not errors
- Members of the same namespace can be spread over different files

New operator (memory allocation)

In C++, before exceptions were introduced, the new operator returned 0 when memory allocation failed.

In standard C++, new throws a bad_alloc exception by default. As a rule, it is best to follow the standard: modify the program to catch bad_alloc rather than check for 0.

In both cases, doing anything other than printing an error message is hard on most systems.

See paragraph 5.3.4, subparagraph 13: http://www.ishiboo.com/~nirva/c++/C++STANDARD-ISOIEC14882-1998.pdf

For the Visual Studio compiler:

1. new (std::nothrow) - does not throw an exception

2. regular new - throws an exception; the behavior can be customized with linker options: http://msdn.microsoft.com/en-us/library/kftdy56f.aspx


Exceptions

When a program is built from separate modules, and especially when those modules come from independently developed libraries, it is convenient to split error handling into two parts:

Reporting an error condition that cannot be resolved locally
Handling errors detected elsewhere

Error-handling code can be shorter and more elegant when it relies on a return value that signals the error, but that approach scales poorly.

As a rule, separating error-handling code from "normal" code is a good strategy
Throwing an exception can leave an object in an invalid state
Throwing an exception can leak memory and other resources when the code that releases them is skipped.

To release an object correctly, it is better to rely on constructors and destructors and on their interaction with exception handling. (When a block is left by throwing an exception, all local automatic objects that have been constructed are destroyed in the reverse order of construction.)

The standard library also provides the auto_ptr template. Writing correct exception-safe code with explicit try blocks can turn out to be difficult.

Notes:

Exceptions can be grouped through inheritance
An exception is copied when it is thrown. The const qualifier in the catch clause has no effect on this. What does matter is whether the handler is declared with T& or with T: the latter invokes the copy constructor. For throw T() the copy constructor call cannot be observed in VS 2012; it can be observed for {T e; throw e;}
An exception can be rethrown
If a constructor throws, the destructor of that object is not called
If a destructor called while an exception is being handled itself throws, this is treated as a failure of the exception-handling mechanism and std::terminate() is called. To tell the two situations apart, the destructor can call uncaught_exception()
Leaving a destructor by throwing an exception violates the requirements of the standard library.
If an exception is thrown but not caught, std::terminate is called. You can supply your own handler with set_terminate.
The process of calling destructors for automatic objects constructed on the path from a try block to a throw-expression is called “stack unwinding.” [Note: If a destructor called during stack unwinding exits with an exception, terminate is called (15.5.1). So destructors should generally catch exceptions and not let them propagate out of the destructor. —end note] (C++ 2003)
std::exception, std::logic_error, std::runtime_error - the main exception classes
some others - bad_alloc, bad_cast, bad_typeid, bad_exception, out_of_range, invalid_argument, overflow_error, ios_base::failure
If a function tries to throw an exception that it has not declared, std::unexpected is called, which by default calls std::terminate (p. 429, Stroustrup, special edition)
int f(); /* May throw any exception */
int f() throw(); /* Throws no exceptions */
int f() throw(x2, x3); /* Throws only exceptions x2, x3 */

Priority of function/operator overloading

Function overload resolution priority

Exact match of types, or a match achieved by a trivial conversion (array name to pointer, function name to pointer to function, type T to const T).
A match achieved through integral promotion and floating-point promotion (char to int, float to double)
A match achieved through standard conversions (int to double, double to int), pointers to derived classes to pointers to base classes, and pointers to arbitrary types to void*
A match achieved through user-defined conversions
A match through the ellipsis ...

If a match can be achieved in two ways at the same level of these criteria, the call is considered ambiguous and is rejected.

When function templates are overloaded, the set of suitable specializations is found by following steps (1-4) below

Find all specializations that could potentially be called
If one of two specializations is more specialized than the other, discard the less specialized one
Perform overload resolution over the functions left after steps 1-2 together with the ordinary functions
If an ordinary function and a template function match equally well, the ordinary function is preferred

If no function survives steps 1-4, the call is an error

Rules for resolving an overloaded binary operator x@y, where x has type X and y has type Y

If X is a class, determine whether operator@ is defined as a member of X or of a base class of X
Look for declarations of operator@ in the context of the expression x@y
If X is declared in namespace N, look for declarations of the operator in namespace N
If Y is declared in namespace M, look for declarations of the operator in namespace M

Namespace lookup rules

A namespace is a named scope. Unlike a class definition, a namespace is open: new functions can be added to it later.

The using mechanism is applied to namespaces more often than to classes.

1. If a function is not found in the context where it is used, lookup continues in the namespaces of its arguments. This rule does not pollute the namespace.

namespace NameSpace {
    struct Type {};
    void func(Type x) {}
}
...
func(NameSpace::Type());

This is called argument-dependent lookup (ADL). It is very useful, for example, when calling an overloaded operator for your type, provided you defined the operator in the same namespace as the type.

Some subtleties I ran into: https://stackoverflow.com/questions/45713667/unqualified-lookup-in-c

Various subtleties of working with namespaces are covered in appendix B.10 of [1], p. 924

2. A locally declared name, and a name declared with a using-declaration, hides a non-local declaration

3. A local declaration of a name takes priority over a name from NS, but a global declaration does not take priority over variables imported from NS::*

4. Conflicts between unused names are not treated as errors

5. Global names live in the "global namespace". This is simply the global level. It differs from every other namespace (including the unnamed one). The only difference between the global namespace and those defined with namespace is that you do not have to write its name. In practice you only need to think about it when you have to write ::global_var because you have hit problem #3. :: is the scope resolution operator. With this construct the lookup always starts in the global namespace and only then continues in the namespaces imported into the global namespace.

6. If a name is declared in an enclosing scope or in the current scope, it can be used without trouble and without full qualification.

7. Constantly repeating the qualifier is distracting. The verbosity can be removed with using:

7.1 Creating a synonym for a single variable with using NS::x; (using-declaration)

7.2 Creating synonyms for all variables from NS with using namespace NS; (using-directive)

8. Placing 7.2 inside another namespace makes it possible to combine functions from different namespaces

9. Creating an unnamed namespace implies that the compiler generates a name for it and inserts using namespace GEN_NAME; into the source file that contains the unnamed namespace.

10. Names explicitly declared in a namespace, as well as names made available with using-declarations, i.e. using NS::x;, take priority over names made available with using-directives, i.e. using namespace NS;

11. Namespaces can be nested. An alias can be created with a construct such as

namespace AA = NameSpace::NameSpace2;

The typename keyword

The typename keyword is used in three situations

1. As a replacement for the class keyword when declaring the type parameters of a class/function/member template.

template <typename T> struct S {};

2. Referring to a type name through the scope of a class that is a template argument

template <class T> struct S { typename T::SomeType a; };

A comment by B. Stroustrup: "In some cases a smart compiler could guess... But in general that is impossible"

3. The typename keyword is required when the type name depends on a template parameter

#include <vector>

// Returns the largest element; vec must not be empty.
template <class T>
T findMax(const std::vector<T>& vec) {
    typename std::vector<T>::const_iterator max_i = vec.begin();
    for (typename std::vector<T>::const_iterator i = vec.begin() + 1; i != vec.end(); ++i) {
        if (*i > *max_i) max_i = i;
    }
    return *max_i;
}

A very detailed description of the typename keyword is also available here:

http://stackoverflow.com/questions/610245/where-and-why-do-i-have-to-put-the-template-and-typename-keywords

Order of class object initialization in C++

C++ Standard, ANSI ISO IEC 14882 2003, 12.6.2.*, p. 230: the order in which classes are initialized and constructors run.

1. Constructors of virtual base classes run first, in depth-first, left-to-right order

#include <stdio.h>

class A {
public: A(){printf("A()\n");}
};

class B:virtual public A {
public: B(){printf("B()\n");}
};

class C: /*virtual*/ virtual public A {
public: C(){printf("C()\n");}
};

class Branch {
public: Branch(){printf("Branch()\n");}
};

class D : public Branch, virtual public B, virtual public C {
public: D(){printf("D()\n");}
};

int main()
{
    D d;
    return 0;
}

Perhaps surprisingly, Branch() is not printed first. The output is:

A()
B()
C()
Branch()
D()

2. Constructors of the direct base classes run in left-to-right order as listed in the class definition (not in the order given in the initializer list)

3. Class members are initialized in the order in which they are declared in the class definition (again, not in the order given in the initializer list)

4. The constructor body runs

The destructor performs the same steps in reverse order:

4. The destructor body runs

3. Destructors of the class members

2. Destructors of the direct non-virtual base classes

1. If the object is of the most derived class in the inheritance graph, the destructors of the virtual bases are called

Deleting an object of incomplete type

class T;                               // forward declaration only, T is incomplete here

void destroy(T* ptr) { delete ptr; }   // <<< undefined behaviour for incomplete types

For a POD, or for an object without a destructor, something like the C runtime free is executed, which does not need to know the size of the object. In that case you got lucky and your delete ptr works, but in general this leads to undefined behaviour. (5.3.5 Delete, C++2003)

Return Value Optimization

http://alenacpp.blogspot.ru/2008/02/rvo-nrvo.html

Include search order

The original C specification assumes that a user include file is searched for starting from the directory that contains the file being compiled
In reality you have to check the rules in the specification of the toolchain you use, or work them out by experiment
B. Stroustrup, Special Edition, p. 487, 16.1.2: "....A standard header whose name begins with the letter c is equivalent to a standard header in the C library." (p. 247, 9.2.2)

C++ has a whole pile of initialization forms

const int v1(expr); // direct initialization

int v2=expr; // copy initialization

int arr[]={1,2,3}; // aggregate (brace) initialization

Predefined identifiers and macros

__func__ - in C99, a predefined identifier that holds the name of the current function.
1. __LINE__, __FILE__ - the current line number and the name of the current source file
2. __DATE__, __TIME__ - the date and time at which the file was translated
3. __STDC__ - the compiler conforms to the C standard

Operator precedence in C++

Excerpt from B. Stroustrup's book "The C++ Programming Language. Special Edition"

Memory Types

2.13.4.1. / 1 An ordinary string literal has the type “array of n const char” and static storage duration

Kinds of functions

http://stackoverflow.com/questions/5500057/how-to-define-a-function-pointer-pointing-to-a-static-member-function/30949326#30949326

5.2.2 There are two kinds of function call: ordinary function call and member function call -- (57) (9.3)....

57) A static member function (9.4) is an ordinary function. (C++ ISO/IEC 14882 2003-10-15)

What you must not do

In C++98/03 a variable must not be modified more than once without an intervening sequence point.

A sequence point is the semicolon at the end of a full expression, a return from a function, a call into a function, and some others.

C++11 introduces the term order of evaluation. The term sequence point is no longer used.

The absence of a return statement in a function body is equivalent to an explicit return; at the end of the function body.

return; may be used in functions whose return type is void and in constructors/destructors.

C++2003, 6.6.3 2:

"Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function"

Miscellaneous

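1. Pointer to constant data versus constant pointer

[2], p. 105

One possible trick for remembering this: read the declaration from right to left, and attach const to whichever it is closer to, the type or the variable.

int value = 0;

int* const const_pointer = &value;       // the pointer itself is constant, the int can be modified

const int* pointer_to_const = &value;    // the int is constant through this pointer, the pointer can be reseated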

2. About <cmath> or <cstdio>

[1], 9.2.2, p. 247. For every C standard library header "X.h" there is a corresponding standard C++ header <cX>.

"X.h" defines the function names in namespace std and also imports those names into the global namespace.

[1], p. 487

"cX" defines the function names only in namespace std

3. Popular exceptions to the one definition rule

[1], 9.2.3, p. 248

Two definitions of

-- a class

-- a template

-- an inline function

are acceptable when:

1. They appear in different translation units

2. They are identical token for token

3. The meaning of the tokens is the same in both translation units (checking this is beyond the language itself and is left to the tooling)

4. All exceptions to the One Definition Rule

https://stackoverflow.com/questions/17667098/inline-template-function

C++2003, p.23, §3.2/5

There can be more than one definition of the following entities if the definitions appear in different translation units:

1. class type (Clause 9)

2. enumeration type (7.2)

3. inline function with external linkage (7.1.2)

4. class template (Clause 14)

5. non-static function template (14.5.6)

6. static data member of a class template (14.5.1.3)

7. member function of a class template (14.5.1.1)

8. template specialization for which some template parameters are not specified (14.7, 14.5.5)

5. Namespace lookup rules

Some related material is here: https://en.cppreference.com/w/cpp/language/unqualified_lookup

Example with a variable "i" in ANSI ISO IEC 14882, C++2003, 3.4.1, (6) (page 30).

namespace A {
    namespace N {
        void f();
    }
}
void A::N::f() {
    i = 5;
    // The following scopes are searched for a declaration of i:
    // 1) innermost block scope of A::N::f, before the use of i
    // 2) scope of namespace N
    // 3) scope of namespace A
    // 4) global scope, before the definition of A::N::f
}

One subtle addition for function names:

Function names obtained from ADL are looked up in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified name lookup.
