First, create some fake data.
. clear
. set obs 1000
obs was 0, now 1000
. g x = 1.1
. list in 1/5, noobs

  +-----+
  |   x |
  |-----|
  | 1.1 |
  | 1.1 |
  | 1.1 |
  | 1.1 |
  | 1.1 |
  +-----+

. count if x ==1.1 // zero matches!!
    0
Stata isn't wrong; you stored the variable x with too little precision. By default, Stata stores new variables as float, while the literal 1.1 in the comparison is evaluated in double precision. Some decimal numbers, 1.1 among them, have no exact finite-digit binary representation, so the float and double roundings of 1.1 differ. If we round the literal to float precision with float(), or store the variable in double format, the issue goes away. Note below how x is represented in hexadecimal and binary IEEE format vs. Stata general (16g) and fixed (f) format.
. count if x == float(1.1)
 1000

. **formats
. di %21x x //hex
+1.19999a0000000X+000
. di %16L x //IEEE precision
000000a09999f13f
. di %16.0g round(x, .1)
1.1
. di %4.2f round(x, .1)
1.10
. di %23.18f round(x, .1)
1.100000000000000089
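The same mismatch can be reproduced without any dataset. The lines below are a minimal sketch, meant to be run as a do-file, that compares the double-precision literal 1.1 with its float-rounded counterpart. The value 1.5 is an added illustration (it is not from the example above) of a number that binary can represent exactly.

```stata
* Literals are evaluated in double precision; float() rounds to float precision.
display %21x 1.1             // +1.199999999999aX+000 (double)
display %21x float(1.1)      // +1.19999a0000000X+000 (float, shown as double)

* The two roundings of 1.1 differ, so an exact comparison fails.
assert float(1.1) != 1.1

* 1.5 = 1 + 1/2 is exactly representable, so float and double agree.
assert float(1.5) == 1.5
```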
Storing the variable (now y) in double format fixes this issue. You could even change the default storage type for all new variables to double, but that would bloat your dataset and is usually unnecessary. You really only need to change variables that require full precision or are being displayed in a table/graph.
. g double y = 1.1
. count if y ==1.1 //works now
 1000
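As a rough sketch of the two options just mentioned, the do-file lines below first widen an existing float variable and then change the default storage type with -set type-. The five-observation dataset is made up for illustration. Note the caveat it demonstrates: -recast- widens the storage but cannot bring back digits that were already rounded away, so the value has to be assigned again.

```stata
clear
set obs 5
generate float x = 1.1       // float storage, as in the example above
recast double x              // wider storage, but the value is still float-rounded
count if x == 1.1            // still 0 matches
assert r(N) == 0
replace x = 1.1              // re-assign now that x can hold a double
count if x == 1.1            // all 5 match
assert r(N) == 5

set type double              // new variables default to double from here on
generate y = 1.1
count if y == 1.1
assert r(N) == 5
set type float               // restore the usual default
```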
Let's look at how to deal with stored results on the fly. The hackish/kludgy solution we have used previously was to convert it to a string and take the substring to truncate the value. This is not ideal.
. g z = 999/_n
. qui su z, d
. di `"`r(mean)'"'
7.477985390007496
. di `"`=round(`r(mean)', 1.1)'"'
7.700000000000001
. di `"`=substr(`"`=round(`r(mean)', .01)'"', 1, 4)'"' //kludge using str
7.48
Instead, we should use one of the solutions below: use the extended macro function 'display' to properly format and/or round these values (SOLUTION 1), or create variables with a proper display format (SOLUTION 2). Think of a display format as a 'mask' over the true (and accurate) stored value.
. **SOLUTION 1: use extended function format**
. qui su z, d
. di `"`r(mean)'"'
7.477985390007496
. local r:display %3.2f `r(mean)'
. di `"`r'"' //use stored result
7.48
. local r:display %3.2f `=round(`r(mean)',.01)'
. di `"`r'"' //use calculated/rounded result
7.48
. g mean = `r(mean)'
. local r: display %3.2f `=mean'
. di `"`r' vs. `=mean'"' //use stored variables
7.48 vs. 7.477985382080078
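A slightly shorter variant of SOLUTION 1, sketched below as a do-file, passes r(mean) to the extended function as an expression rather than expanding it as a macro first. The local name m and the %4.2f width are illustrative choices, not part of the original example.

```stata
clear
set obs 1000
generate z = 999/_n
quietly summarize z

* -display- evaluates the expression, so r(mean) needs no macro quotes here.
local m : display %4.2f r(mean)

* The formatted text can then be dropped into any string, e.g. a note or title.
display "Mean of z: `m'"
assert "`m'" == "7.48"
```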
. **SOLUTION 2: create precise, formatted variable or scalar**
. qui su z, d
. g double p1 = `r(mean)'
. di %3.2f `=p1[1]' //display without macro extension
7.48
. l p1 in 1

     +-----------+
     |        p1 |
     |-----------|
  1. | 7.4779854 |
     +-----------+

. *fix display format:
. format p1 %3.2f
. l p1 in 1 //fixed

     +------+
     |   p1 |
     |------|
  1. | 7.48 |
     +------+
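To make the 'mask' idea concrete, the sketch below (a do-file fragment with a made-up one-observation dataset and a hypothetical variable p) checks that -format- changes only what is shown, not what is stored.

```stata
clear
set obs 1
generate double p = 7.477985390007496
format p %3.2f                    // changes only how p is displayed
list p                            // shows 7.48

* The stored value is untouched by the display format.
assert p == 7.477985390007496
assert p != 7.48
```

This is why formatting is usually preferable to rounding the stored value itself when the goal is only a clean table or graph.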
Instead of macros or variables, we can also work with lightweight -scalar-s to get the same result.
. *note:
. scalar s1 = `r(mean)'
. di %3.2f s1
7.48
. di s1
7.4779854
. assert `=s1' == p1 //true
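Scalars work here because they hold values in double precision, so they do not suffer the float rounding seen with x. The do-file sketch below uses the hypothetical scalar names s and f to check this. It wraps the names in scalar() because Stata gives a variable of the same name precedence over a scalar.

```stata
scalar s = 1.1
assert scalar(s) == 1.1           // scalars keep full double precision
display %21x scalar(s)            // +1.199999999999aX+000

scalar f = float(1.1)             // deliberately rounded to float precision
assert scalar(f) != 1.1
display %4.2f scalar(f)           // 1.10: formatting hides the difference
```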
For more information on storage precision, check out these items written by the owner of Stata, William Gould, HERE and also HERE.