c_basics - RicoJia/notes GitHub Wiki
========================================================================
========================================================================
-
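Endianness
- Origin: the terms come from Gulliver's Travels. The question is which end of a multi-byte value is stored first: the most significant byte at the lowest address (big endian) or the least significant byte at the lowest address (little endian):
int 123 -> 1-2-3 (big endian) or 3-2-1 (little endian)?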
-
Style
- the heading comment of a program contains: Author, Purpose, Usage, References
- You spend more than half of your time maintaining your code, so be clear
- OOP, structured programming, goto-less programming.... These are "cults"
- goto is useful for breaking out of nested for loops, without using too many flags
- avoid side effects, use ++ or -- on separate lines
- the law of least astonishment: no surprise = the best
- surround macros with {}
#define DIE(msg) {printf("eheh"); }
- use
#ifdef, #undef
at the top of the program
-
A program can launch multiple processes
- In Unix systems, each program has an initial process
- Then you can launch "clone" of the original process
-
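GCC (the GNU C compiler) comes from the Free Software Foundation, which distributes it freely for many computers. The C language itself was created by Dennis Ritchie at Bell Labs.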
-
software has a life cycle, it dies, and gets replaced by a younger product
- write a project description, then show it to your boss and colleague (like a google doc)
- build a minimal prototype
- Understanding someone else's work is the hard part.
- Initial meeting: which files, what they do on a high level, how to build and launch, how to test (if there's a recommendation)
- cross reference, indenters are great tools.
- taking notes is very important.
- need to learn some debugger
- When you work on new functions, start from the new functions, understand how it works, then work your way up
- actually starts processor commands, can be used to edit something other than c.
-
Floating point
- floating-point addition requires the operands to be shifted so that their exponents line up; floating-point multiplication does not. So the two take about the same amount of time
- storage and calculations may have different accuracies.
- in K&R C, floating-point arithmetic was always done in double; since C89 the compiler is allowed to evaluate float expressions in float
- sine values might have bad precision at some angles.
-
Files
- printer, terminals are files too.
- each file is treated as a series of bytes
-
expose: add an interface to manipulate objects.
-
optimization might be expensive: and can bring only 10-20% speed up.
- register: for fast access. Unit has 12 registers.
register int i;
- shift instead of multiplying by a power of two
a << 2; // same as a * 4
- for nested loops, have the simple function in the inner loop
- the ultimate optimizations: floating-point accelerator -> high-speed RISC computer -> macros -> assembly directly.
- instead of having ascii files, have binary files
- Ascii is like web pages. binary files: images
-
Compiler
- compiler first has a lexical analyzer, spitting out tokens and operators for words.
-
The OS reclaims all memory used by a program when it exits, even if the program leaked memory
-
To see files in hex, use xxd, or hexedit
-
ALWAYS KNOW WHAT THE REQUIREMENTS ARE
- ask boss what the final form is.
- Ask him what other potential features they'd like to add, to reduce the amount of "oh could you please add this? could you please change this" etc.
- Know what matters to development and what doesn't matter
-
Design a minimalist development track
- Know how to benchmark it?
- Make your input & output exactly like the benchmarks, very easy to benchmark
- like the histogram count cuda project, libjpeg doesn't yield the same result as np.histogram(), and we can just output the same array as libjpeg.
-
Always fast prototype for something small:
- Try using gcc, nvcc for compilation (instead of cmake)
- Use existing library to avoid complex libs (may install them)
-
Compilation:
- Compilation: process
-
setbuf(stdout, NULL): printing without buffering
- pause() puts a thread to sleep until a signal arrives; sleep(1) sleeps for one second
- Signals: good read
- SIGUSR1 doesn't have a pre-assigned keybinding, so it can be sent by another user process
- example
-
-
Experience: for large project, it's important to have a struct with all params!
-
Copying and pasting can be detrimental: if something that should have changed was left unchanged, it can take long to debug. For example, pasting a cmake copy command whose target has already been copied yields another empty copy, which is strange to debug.
-
reproduce the bug + locate it (90% of the time)
-
talking to other people will help you find the bug
========================================================================
========================================================================
-
printf
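- %g prints a floating-point number in whichever of %f or %e (scientific) notation is shorter
- the order in which printf's args are evaluated is unspecified; some compilers evaluate them right to left, like pushing onto a stack. Modifying i while also reading it in the same call is undefined behavior
    int i = 1;
    printf("%d, %d", i, i++); // may print 2, 1; do not rely on it
- printf does not check whether enough args are supplied; missing args are undefined behavior and usually show up as garbage
    printf("%d, %d"); // see garbage values
- printf("%p", (void *)ptr) prints the address stored in a pointer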
-
argc (num of args), argv (args themselves, the first arg is the name of the program);
    while ((argc > 1) && (argv[1][0] == '-')){
        switch(argv[1][1]){
            case 'v':
                //DO_STUFF
                break;
            case '0':
                //do stuff
                break;
            default:
                break;
        }
        --argc;
        ++argv; // move onto the next arg
    }
-
fgets + sscanf
- read a line into an array with fgets, then parse the array using sscanf
    char arr[15];
    fgets(arr, 15, stdin); // stdin is the keyboard
    int count, total;
    sscanf(arr, "%d %d", &count, &total); // sscanf just parses the arr
- don't use scanf because it's fussy about '\n'
- Initialize char arr:
    #include <string.h>
    char arr[] = "123"; // method 1
    char arr2[12];
    strcpy(arr2, "lol"); // method 2
- fgets keeps the '\n' at the end of the line, so STRIP OFF \n if you need to
- sscanf has a sister function: fscanf, just scans thru files.
-
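'\n' vs '\r\n'
- UNIX uses '\n' (called line feed), whereas MS-DOS uses '\r\n' (carriage return followed by line feed), and classic Apple systems use '\r' (<RETURN>)
- Before computers, there were teletypewriters. Switching lines took two units of time, where 1 unit of time is what typing one char takes, so two chars were sent
- After computers arrived, storing two chars per line ending was wasteful, so UNIX chose to keep just one (line feed)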
-
logging
-
fprintf
- usage: note you DON'T NEED to concatenate strings yourself, adjacent string literals are joined by the compiler!
    FILE * pFile;
    pFile = fopen ("myfile.txt","w");
    fprintf (pFile, "Name %d [%-10.10s]\n",666,"lol");
    fprintf (stdout, "Name %d [%s]" "ll" "asdfa",666,"lol"); //Name 666 [lol]llasdfa
- a simple macro for logging: note that ## before the variadic argument is a GNU extension; it is only meaningful after a comma, and removes that comma when no extra args are passed
    #define CYN "\x1B[96m"
    #define LOG(fmt, arg...) fprintf(stderr, CYN "haha " fmt, ##arg)
    LOG("status : %d", 123);
- Other ways for variadic
- macro that expands to several comma-separated printf args:
    #define BYTE_TO_BINARY(byte) (((byte) & 0x02) ? '1' : '0'), (((byte) & 0x01) ? '1' : '0')
-
-
File IO
- fopen() flags:
  - w: create the file if it is not there; an existing file will be overwritten
  - b: open the file as a binary file
- format a string using sprintf (snprintf is safer because it takes the buffer size):
    int main(){
        // string formatting. Restriction: preallocate arr!
        char arr[10];
        float num = 1.2345;
        snprintf(arr, sizeof(arr), "he:%06.3f", num);
        printf("%s", arr);
    }
========================================================================
========================================================================
- Misc
- pointers are also calledscript callbacks
- sizeof
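- sizeof(type) gives the number of bytes of that datatype; sizeof(ptr) gives the size of the pointer itself, not of what it points to
- pointer arithmetic
    int *p = arr + 2; // points to arr[2], the third element of the array
    p - arr;          // yields 2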
- Null pointer (C and C++)
- (int*)0 is a constant expression whose value is a null pointer
- 0 or (void*)0 in C are null pointer constants.
- A null pointer constant can be assigned to an lvalue of any pointer type, but a constant expression that already is a typed null pointer can only be assigned to compatible types.
- NULL is a macro
    long *a = 0;         // ok, 0 is a null pointer constant
    long *b = (long *)0; // ok, (long *)0 is a null pointer with appropriate type
    long *c = (void *)0; // ok in C, invalid conversion in C++
    long *d = (int *)0;  // invalid conversion in both C and C++, long and int, different types!!
- Pointer Casting:
char i = 1;
int *p = (int *)&i;
printf("%d", *p); // We're forcing the compiler to read i as an int! This reads past the single byte of i, so it is undefined behavior.
-
Void Pointers
- Motivation: a void pointer can hold the address of any object type. However, it must be cast to a concrete type before it is dereferenced
    int a = 1;
    void* ptr = &a;
    printf("this is an integer: %d", *ptr); // this doesn't work
    printf("this is an integer: %d", *((int* )ptr)); // this does work
    double b = 1.0;
    ptr = &b;
    printf("this is a double: %f", *((double* )ptr));
- Malloc, Calloc
    #include <stdlib.h>
    int* x = malloc(sizeof(int)*5); // array of 5 elements; fine in C, but won't compile in C++ because malloc returns void*
    int* y = (int*) malloc(sizeof(int)*5); // the cast is required in C++
- good practice for malloc: check the result before using it
    b.ptr_ = NULL;
    b.ptr_ = malloc(sz);
    if(!b.ptr_) { /* handle the allocation failure */ }
- Limitations
- pointer arithmatic is not possible because we don't know its concrete size.
- Cannot be dereferenced
- Uses
- You can't dereference void* or access members through it directly; cast it back to the struct pointer type first
    typedef struct student { int id; } student;
    student s = {42};
    student* student_1 = &s;
    void* data = student_1;
    ((student*)data)->id; // data->id would not compile
-
Function pointers
- Basics
    void (*fun_ptr)(int) = &FUNC; // we need () around *fun_ptr, else it declares a function
    void (*fun_ptr2)(int) = FUNC; // also works
    void (*fun_ptr_arr[])(int) = {func_1, func_2}; // array of function pointers
- No need to allocate/deallocate function pointer.
-
void* is used for generic programming in C. It can provide a callback signature to upper level applications, to avoid code redundancy. This is called "responsibility delegation"
- The key is that library functions should take in void* and return void*; the application uses type casts for its specific types
    //app.c
    #include <stdlib.h> //qsort
    int compare(const void* a, const void* b){
        return (*(int*) a - *(int*) b); // of course you need type casting (the subtraction can overflow for extreme values)
    }
    int main(){
        int arr[] = {1,2,3};
        int n = sizeof(arr)/sizeof(arr[0]);
        qsort(arr, n, sizeof(int), compare); //generic programming
    }
-
Double Pointers
    int i = 1;
    int * const p = &i;   // const pointer to int
    int * const *pp = &p; // pointer to a const pointer to int
- int* const* int_double_ptr
- const here applies to the inner pointer: the address that the inner pointer stores cannot be modified through int_double_ptr, while the int it points to can.
-
Cautions
- making a local variable with the same name as a global might cause a segfault, because the local shadows the global inside its own initializer
    Node node = node->next(); // the other node is a global variable; here node already refers to the uninitialized local
========================================================================
========================================================================
-
Static
- static has three meanings:
  - static functions can only be used in the current translation unit
    - the symbol has internal linkage, so it is not exported from the .o file
  - inside a function, a static variable is allocated from static memory and keeps its value between calls
    - static variables are exclusive to a function, but not to a thread (all threads share them).
  - at file scope, a static variable is allocated from static memory AND accessible only within that translation unit
    - So if your header includes a global variable of the same name, you'll have to change names.
    - A single translation unit is any preprocessed source file together with its included headers, i.e., what gets compiled into one object file.
-
volatile: the value can change at any time without any code of yours changing it.
- So the compiler does not optimize away reads or writes to it.
- Normally the compiler may copy the value from memory into a register and keep accessing the register. volatile prevents this.
- For example, if the same variable is stored to twice in a row, the compiler must perform both stores instead of one.
i = 2; i = i; // must store twice
-
define
-
#define vs const
- const is usually better than #define, but old code uses a lot of #define
- const can work with any type, like struct, and is type-checked. #define is plain text substitution
- Except #define is essential for conditional compilation.
- Conditional Compilation: debug using #define
    #define DEBUG
    #ifdef DEBUG
    printf("debug build\n");
    #endif
- or define DEBUG from the compiler command line, e.g. gcc -DDEBUG (in CMake, add_compile_definitions(DEBUG))
- /* */ comments do not nest, so they cannot comment out code that already contains a /* */ comment
- So a better alternative is
    #if 0
    ...
    #endif
- Force-remove a #define with #undef, so that later #ifdef checks on it are no longer effective
    #undef DEBUG
- [design pattern]: create const.h to store all consts.
    #ifndef __CONST_H__ // checks if const.h has been included somewhere
    #define __CONST_H__
    ...
    #endif
- technically you can do
    #define count int
    count flag; // expands to: int flag;
-
-
Auto (TODO)
-
pragma
- What is it? TODO
-
- #pragma is for compiler-dependent commands.
- #pragma startup FUNC1: specify a function FUNC1 that needs to run before program start-up (before control is passed to main). This pragma is compiler specific (e.g. Turbo C); GCC ignores it.
- #pragma exit FUNC2: run FUNC2 just before the program exits (after control returns from main)
-
- visibility
- use it to declare functions only within the scope of a shared library
    #pragma GCC visibility push(hidden)
    /* declarations here are hidden from callers outside the shared library */
    #pragma GCC visibility pop
- So the dynamic symbol table becomes smaller. This is an optimization.
- Don't use it with exceptions
- Don't put #include in it
- http://hirntier.blogspot.com/2008/09/gcc-visibility.html
- comma:
    c = (a, b);    // equivalent to: a; c = b;
    ptr = &(a, b); // C++ only: a; ptr = &b; (in C the result of the comma operator is not an lvalue)
- assignment
=
- returns the value of the lhs after the assignment
    (a = b); // evaluates to the new value of a
    // Expert Example
    while ('\n' != (*p++ = *q++));
    // equivalent to
    char c;
    do { c = *q; *p = c; ++p; ++q; } while (c != '\n');
- You can assign to a variable in the while condition:
    while (node = (node->next_).get()){}
========================================================================
========================================================================
- Basics
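- a char array used as a string needs to end with '\0'
- those created by string literals will automatically have '\0'
- Operations
- print string:
char string[] = "1 233"; printf("%s", string);
- passing a string to a float format is undefined behavior; you may get 0.0 if you do
char str[2] = "hh"; printf("%.1f", str); // also note: str has no room for '\0'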
- string functions:
- strcpy: have to use this to copy a string literal over.
char arr[15]; strcpy(arr, "lolol");
-
strcmp(const char*, const char*)
- <string.h>
- returns 0 if the 2 strings are identical; otherwise returns > 0 or < 0 depending on the first pair of chars that differ
-
strncmp(const char*, const char*, size_t)
- pretty much the same as strcmp, but it only compares at most the first n chars of the strings.
- caution for both: 0 means "equal", so test with == 0 instead of treating the result as a boolean
-
array initialization
- Designated Initialization: only in C, not in C++
int arr[2] = {[0]=1, [1]=2};
-
2D array initialization
const char* str_arr[] = {"lol", "hehe"};
- or, you can write it this way:
What is this syntax for initializing 2D array?
int aaa[3][3] = {[0] = {6,2,3}, [1]={2,3,4}};
-
string literals are stored in static (often read-only) memory, not on the stack
- char [] has its own copy of the whole string literal as a char array (on the stack if it is a local variable)
- char arr[] = "SOMETHING" is modifiable, because it's a copy
-
Change a string:
- char* str = "something" stores the address of the first char of the string literal in str; the literal itself is read-only. C (and older C++, with a warning) lets you store it in a non-const char*.
- so you can't directly modify the string literal str points to: *str = 'a'; will fail (undefined behavior, typically a crash), since it points to read-only memory.
- But you can change the address that str stores, making it point to another string literal.
- The solution:
    //void getStr(char* str_copy); // receives a copy of the pointer, but we want to change str itself!
    void getStr(char** str){
        *str = "GFG";
    }
    int main() {
        char* str = malloc(5);
        str = "lol"; // memory leak: str now points to the literal and the malloc'ed block is lost
        getStr(&str);
        printf("%s", str);
    }
- You may succeed with
    char str[] = "something";
    str[1] = 'a'; // fine: str is a modifiable copy of the literal
- The best way to fill a char array with a string is to use strcpy
    strcpy(str, "this is a test"); // str must be large enough
-
Returning a char* that points to a string literal is safe, but returning a local char[] is not, since the array is a copy of the string on the stack
    char* getStr(){
        char* str = "GFG";
        return str; // Safe
    }
    char* getStr2(){
        char str[] = "GFG";
        return str; // not safe.
    }
- printf("%u") is to print unsigned int
- For int types,
    printf("%8u", i);  // if i is shorter than 8 chars, the front is padded with blanks; otherwise nothing is truncated
    printf("%08u", i); // pad with 0 instead of blanks
- For float types,
    printf("%4.2f", 32.145);  // get 32.15
    printf("%4.10f", 32.145); // 10 digits after the decimal point are always printed, padded with 0; the width of 4 is already exceeded, so there is no spacing in front and nothing is truncated
- combine two bytes into one
    uint16_t value = (highByte << 8) | lowByte;
- on 16-bit systems such as MS-DOS, int is 16 bits; on modern Windows and POSIX systems it is 32 bits. The C standard only guarantees at least 16 bits.
- reinterpreting the bits of a float as an unsigned int (this copies the bit pattern, it does not convert the value):
    union { float f; uint32_t u; } f2u;
    f2u.f = 1.1;
    uint32_t out = f2u.u;
- Reason: copying a non-negative float to a uint is "order-preserving" on the bits.
- if you work on x86, ARM, you need to swap the bits as well
- floats are stored as sign bit and magnitude, ints are two's complement. So for negative values you need to flip the non-sign bits.
- converting float to string:
    #include <stdio.h>
    int main() {
        float f = 1.123456789;
        char c[50]; // large enough for the formatted number
        snprintf(c, sizeof(c), "%g", f);
        printf("%s\n", c); // never pass a non-literal string as the format
    }
-
_Bool On = 0; // it's not false.
- _Bool is part of C99.
-
bool, true, false are defined in <stdbool.h>, so you may need to explicitly include it.
- Basics
- can be thought of as a "multi-purpose variable". A union can have multiple members, but only one member can hold a value at a time.
- The other members don't have meaningful values (garbage value),
- But members of the same type will be altered too
- All members share the same memory location.
- The size of the union is the size of its largest member.
- E.g,
    union Test{
        int x,y; // the size is 4 bytes, since int is the largest type here.
        char z;
    };
    int main(){
        union Test t;
        t.x = 4;   // This will make y 4 as well, but z is garbage
        t.z = '3'; // x,y will have invalid values, z='3'.
        t.y = 100; // now x will be 100 as well.
        union Test t2 = {2}; // only the first member can be initialized this way
    }
- Uses:
- Use union when you just need one var at a time.
-
Creation
- basic structure
    #include <stdio.h>
    // 1. use typedef - you're just adding a name here, not replacing the original one!
    typedef struct Foo{
        int i;
        double j;
    } Foo;
    // 2. use struct
    struct Bar{
        int i;
    };
    int main() {
        // 1. use typedef, and designated initialization
        Foo f = {.i=1,.j=2};
        // 2. use struct, and list initialization
        struct Bar b = {3};
        // designated initialization
        int arr2[6] = {[4] = 29, [2] = 15 };
        // standard initialization
        int a[6] = { 0, 0, 15, 0, 29, 0 };
        printf("%f", f.j);
        return 0;
    }
- caution about typedef:
```c
typedef struct node{
    struct node* prev_; // You need "struct node" here: the typedef name Foo_node does not exist until the declaration is complete.
}Foo_node;
Foo_node fn;
struct node fn2; // also valid
```
-
Anonymous structure: no structure name, so there's only one struct object
```c
struct {
    /* members go here */
} VAR_NAME;
```
-
designated initialization
- Applied on aggregates (arrays, unions, structs)
- e.g.
typedef struct Foo_{ int a; int b; }Foo; Foo f = {.a = 1, .b = 100};
-
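bit fields: the code the compiler generates to extract bit fields is large and slow, so don't use this unless storage is a problem
    typedef struct{
        unsigned int i : 1;      // unsigned, so the field holds 0 or 1
        unsigned int parity: 1;
        unsigned int error: 1;   // 1 bit each, packed into the same storage unit
    }hee;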
-
Cautions
- typedef name and the struct tags are slightly different: this is an old convention, where typedef and the struct names are different.
- _t means type, _s means structure
========================================================================
========================================================================
-
Malloc vs Calloc
- Difference:
- malloc only allocates a memory block, no initialization. calloc allocates a memory block and zero-initializes it
- Interface is different
    int* arr1 = (int*) malloc(5 * sizeof(int)); // faster than calloc when you don't need zeroed memory
    memset(arr1, 0, 5 * sizeof(int)); // initialize using memset; the size is in bytes
    int* arr2 = (int*) calloc(5, sizeof(int)); // takes count and element size, and initializes everything to 0
    free(arr1);
    free(arr2);
- Similarities:
- Both return void* (NULL on failure); in C you don't need to cast the result explicitly.
- Both need free()
-
Objects created by malloc are in consecutive memory
    typedef struct Obj{
        char name[80];
        unsigned int salary;
    }Obj;
-
Free
- used on malloc, calloc.
- Good practice is to use it in the same function as malloc, calloc. In C++, RAII is a great standard to follow
- You can't check whether a pointer has been freed, so people usually set ptr = NULL after free
-
memory manipulation
    #include <string.h>
    void* memset(void* dest, int c, size_t n);
    void* memcpy(void* to, const void* from, size_t size);
- memset
- sets memory byte by byte to a value. So only values whose bytes are all the same work for wider types: 0 (0x00 in each byte) or -1 (0xFF in each byte, which gives 0xFFFF in one uint16_t)
    #include <string.h>
    char str[20];
    memset(str, -1, sizeof(str));
========================================================================
========================================================================
- Basics:
- It allows a user-defined function with the same name to replace a default one, a bit like overriding.
- You define your function without __weak and your function will always be taken
    // E.g 1
    void USART1_IRQHandler (void) __attribute__ ((weak, alias("Default_Handler")));
    void Default_Handler(void) { while(1); }
    void USART1_IRQHandler(){ ... } // strong definition, placed in the user's own file
    // E.g 2
    __weak void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart) {
        /* NOTE: This function Should not be modified, when the callback is needed,
           the HAL_UART_RxCpltCallback could be implemented in the user file */
    }
    void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart){} // user file: overrides the weak version
- Basics
- The preprocessor language is not C, so don't use = or ; in a #define!
- single line
- multi line
    // need {} to enclose the macro! This is simply text substitution.
    // Comments cannot follow the line-continuation backslash.
    #define Bar(x) { \
        x + 1;       \
        x + 1;       \
    }
========================================================================
========================================================================
- print time in timestamp (uint64_t)
    #include <stdint.h>
    #include <time.h>
    (uint64_t)time(NULL); // time in seconds elapsed since 1970-01-01
========================================================================
========================================================================
-
How malloc works
- You may have around 6GB physical RAM
- If you just malloc memory without touching it, the OS (with overcommit, as on Linux) will let you ask for far more than the physical RAM
- But once you actually use it, e.g. with memset, you can only use around 6GB (or a bit more, as virtual memory can use some disk space)
-
Memory Mapped IO
- On uC, there're two ways to interface with registers:
- Port mapped (use special CPU instructions)
- Memory Mapped
- More convenient
- each device register is assigned to an address in the memory address space
- Illustration
- Memory Map of STM32
-
Memory Map of Linux
- Heap starts from the bottom, till program break
- When you do malloc, top of heap goes up.
- When you do free, top of heap goes down
- Two functions that change the program break:
-
not recommended for personal use, because they exist for implementing malloc
    #include <unistd.h>
    int brk(void *addr);            // push the program break to a location
    void *sbrk(intptr_t increment); // specify by how much you want the program break to move up; returns the address of the previous break
- sbrk may effectively move the break by 4096 bytes, because virtual memory uses paging and the size of one page is 4096
-
mmap
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #define PAGESIZE 4096
    int main() {
        // 0xFEEDBEEF is only a hint; changing it to 0 or NULL means "I don't care where you put this memory"
        // PAGESIZE is how much memory you're asking for
        // PROT_READ | PROT_WRITE: we're allowing read and write
        // MAP_PRIVATE | MAP_ANONYMOUS: not shared, private to the process, and not backed by a file
        // The block has to begin at a page boundary, so the kernel picks a page-aligned address near the one we requested.
        unsigned int* first = mmap((void*)0xFEEDBEEF, PAGESIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (first == MAP_FAILED) { perror("mmap"); return 1; }
        printf("Address %p", (void*)first);
    }
-
open a file
fd_ = open(loc_.c_str(), O_RDWR | O_CREAT, (mode_t)0600)
- If fails, make sure you have the directory created already
- ctrl-c has to do with signals in Unix: it sends SIGINT to the foreground process
- signal(SIGPIPE, SIG_IGN) ignores SIGPIPE; it needs #include <signal.h>
========================================================================
========================================================================
- types don't match (with decltype...): (no return type, ooops)
- this pointer: not available in constructor? (it is available; use it only on already initialized items, not on derived class members.)
-
std::function needs this pointer? No, we can simply do if (!function_ptr)
- std::bad_alloc
- For emplace_back, If reallocation fails bad_alloc exception is thrown.
- check if the list itself is valid. if there's type casting, vec.size() doesn't really solve the problem
- segfault
printf(1); // the first argument must be a format string; passing a number makes printf read from address 1, which gives you a segfault
-
!2 == 0 because ! yields a boolean result
- don't modify global variables outside main: statements cannot run at file scope because the program is not running yet, so you can only declare and initialize them.
    int i = 3;
    i = 4; // error: assignment statement outside a function
    int main(){ }