languag2
Wed 09/24/08
exam monday
chapters 1, 3, 4, first part of 5
Last time:
started chapter 5
what does it mean to be a variable
int x; string s;
.. even function
what are the attributes of a variable?
- name
- type - explicitly
- location
- value
- lifetime
- scope
we did name
talked about address - when we say int x, we assume it keeps the same spot .. that's the memory cell that I've been allocated
talked about type -- as determining the range of values
the amount of memory you get, the range of things you can put in there
and the operators that are valid on it
binding (5.4 --> isn't on the exam)
binding:
def:
(1) association between an entity and its attribute
(2) association between a symbol and its intended operation
example:
(1) int x; associates type int with x
int y = 3; .. associates the data type int with y and the value 3
(2) * multiplication for numerics -- all kinds of meanings
% modulus - gives the remainder of integer division
++ for increment
when the association is made: -> the binding time
Possible times (6):
1) language design time - when whoever's designing the language designs it .. ex: * means multiplication
2) language implementation time - aka compiler design time
ex: number of bytes for an int or a double
ascii or unicode?
3) compile time:
example: int x --> x with its data type
.. entry in the symbol table for x is now an int .. and I store that
need that for later to allocate space
4) link time: address of a library function with its name
5) load time: storage associated with a variable
6) runtime: value of a variable .. x = 3*y binds the value of 3*y to the name x
a few more examples:
const size = 10;
const double pi = 3.14159
const float pi2 = 3.14159
float generates a warning because
floating point literal .. literally means exactly that..
well at compiler design time, somebody decided that floating point literals have data type double
so that's a double .. trying to assign a double to a float
sum = sum / 3;
when do I know the datatype of sum?
that's bound at compile time -- somewhere somebody specified the type of sum
possible values for sum - compiler design time -- "allocate 4 bytes"
meaning of slash - idea that it means division .. is probably language design time
but we also know that it's overloaded .. int / int = quotient
but int / double
or double / int
or double / double the result is a double .. traditional division
the compiler .. at compile time .. determines which of the above .. based on the operand types.
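A minimal sketch of the overloaded slash, assuming nothing beyond standard C++. The three function names are hypothetical helpers, not from the lecture; each one just pins the operand types so the compiler has to pick one meaning of `/`.

```cpp
// Hypothetical helpers (illustration only): each wraps one form of '/'.
// The compiler picks the operation from the operand types at compile time.

// int / int: quotient only, the fractional part is discarded.
int int_quotient(int a, int b) { return a / b; }

// int / double: the int operand is converted to double first.
double mixed_quotient(int a, double b) { return a / b; }

// double / double: traditional division.
double real_quotient(double a, double b) { return a / b; }
```

Calling `int_quotient(7, 2)` gives the quotient 3, while the other two forms give 3.5 for the same numbers.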
3 looks like?
also compiler design time -- it's an int
but also something really interesting about how constants are handled
in C and C++ .. they're called literals
but they stay constants as they go into machine language
in fortran, it's allocated memory .. makes a memory location and stores the 3 there
so every time it needs the constant 3, it goes to that location
interesting problem .. i can change the constant during runtime
if I call func(x,3);
--> func(x, <variable>);
which means if that's a reference parameter I can change it!!!!!!!
class pet {
public:
virtual void f() { cout << "feed me"; }
};
class cat : public pet
void f() .. cout << "feed me tuna"
class dog : public pet
void f() .. cout << "feed me steak"
runtime binding
pet * p;
ask the user for input
if (user picks cat)
p = new cat();
else
p = new dog();
p->f();
here's a binding that happens at runtime
... there are three f's that could be called here
based on what the user enters
so even the function f is bound at runtime
whether it's a new dog or a new cat
before that it's bound to pet's f .. because p is a pet pointer.
so that's an attribute .. an address that's associated at runtime
5.4.1 Binding of Attributes:
static binding .. occurs before runtime and remains unchanged during program execution
int x --> for the lifetime of x (in C/ C++), x is an int
dynamic binding (not static) -- occurs during runtime or can change during program execution
int sum - static
but the value of sum is dynamic
pet * p is mapped to a memory location
when I talk about the contents of p .. that changes
p->f() .. dynamic
data types in javascript are dynamically bound
if you don't test for data type, very weird things can happen
"000010" the string
vs 10 the int
5.4.2 type binding
static binding of types
before runtime, doesn't change
can do this via explicit declaration ..
int x;
explicit declaration of the type of x
list type with identifier
implicit "declaration"
in fortran variables beginning with an i, j, k, l, m, n are automatically integers
i = 17 .. assumes i is an int
else assumed to be reals
can turn that off by using implicit none
two instances of implicit in C/C++
const size = 10;
.. assumed to be an int
const pi = 3.14159; .. doesn't work
that works because every time you see size in the program it makes it a 10
that's how they designed the c++ compiler
function (int x) {
//
}
.. can forget the return type .. but this is also assumed to be an int
javascript .. all implicit
declaration
definition
explicitly declaring the type ..
but this is a variable definition
definition assigns attributes and allocates storage.
declaration assigns attributes but does not allocate storage
declaration - const int size =10;
.. nothing is allocated as storage
type declaration:
struct kid {
string name;
int age;
};
dynamically binding types
most current languages require explicit declarations of type:
C/C++
Java
exceptions:
scripting languages
javascript, perl, ml, ruby
dynamic type binding:
variable is bound to a type when it's assigned
.. change its type depending on what I assign to it.
examples:
in perl
$id .. can be either a string or a numeric value
@name .. defines an array .. but we don't know of what
%name hash structure
$lucky = 5;
$lucky = "my lucky number is $lucky"
$lucky on the symbol table
started off as an int -- now it's a string
i changed it while the program was running!!
things that use a lot of dynamic type binding are usually not compiled languages .. they're interpreted languages
doesn't actually produce an exe
@kids = ("tim", "maggie", "travis");
@kids = ("adam", "chad", "dave", "amanda", "courtney");
@kids = (21, 22, 23, 24);
it becomes whatever is on the right hand side
that's what dynamic binding is
javascript
nieces = ["amanda", "sarah"]
nieces = "kelsey"
advantage - don't have to worry about ints and floats
some advantages to this guy!
- flexibility** .. really incredibly flexible .. there are those who will argue that this makes it a great beginning language
- #s
disadvantages:
- "error detection" of mismatched types .. none!
array * 2 will get caught .. some will get caught
but never the idea that the left side can't take the right side
less reliable
- costly at execution time - updating the symbol table or variable descriptor
because I'm always changing the type info
= reallocate memory
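A rough sketch of what dynamic type binding costs, written in C++ only as a simulation. `DynVar`, `Tag`, and `times_two` are made-up names, and real interpreters keep much richer descriptors. The point is that the type tag is rewritten on every assignment and has to be checked while the program runs.

```cpp
#include <string>

// Hypothetical descriptor for a dynamically typed variable (illustration only).
enum class Tag { Int, Str };

struct DynVar {
    Tag tag = Tag::Int;   // type information that must survive until run time
    int i = 0;            // payload when tag == Tag::Int
    std::string s;        // payload when tag == Tag::Str

    // Every assignment rebinds the type as well as the value.
    void assign(int v)                { tag = Tag::Int; i = v; s.clear(); }
    void assign(const std::string& v) { tag = Tag::Str; s = v; }
};

// An operation has to inspect the tag at run time.
// Returns false on a type mismatch instead of producing a result.
bool times_two(const DynVar& v, int& out) {
    if (v.tag != Tag::Int) return false;  // only detectable while running
    out = v.i * 2;
    return true;
}
```

Assigning 5 and then a string to the same `DynVar` mirrors the `$lucky` example: the second assignment succeeds, and the mismatch only shows up later when `times_two` is called.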
-------
Fri 09/26/08
Lab 4 -
the four problems:
((!a) < b)
no definitions
example and say -- what are the non-terminals
so identify but don't need to define
get rid of left recursion
deal with where two guys are not pairwise disjoint
associativity and precedence
assoc. the order in which things are grouped if they have the same precedence.
regexp .. convenient for ebnf but .. won't have to write a reg exp
might ask for valid strings from a regexp
great for lexical analysis.
expect several small questions
show something is ambiguous
remove left recursion
ebnf or bnf .. write one of the functions
understand precedence and associativity
calculate FIRST .. pairwise disjoint
derive a string
tell what lexical analysis really means
why next token?
difference between lex and syn .. lex is part of syn.
how do we understand if a language is good
different types .. compiled, hybrid etc
end of chapter 1 questions are "really wonderful"
prefix & infix & postfix
won't have to write grammars for specific things
should be able to adapt one .. but not start from scratch
pairwise disjoint fix:
the test
A → α, A → β
if FIRST(α) and FIRST(β) share a symbol, the two rules are not pairwise disjoint
A → (+A
A → (-A
A → (A-hat
A-hat → +A | -A
introduce a new variable that follows the common symbol.
example in test
var → id | id [expr]
var → id
var → id[expr]
first(id) = [a-z]
both have the same first
var → id V-hat
V-hat → nothing | [expr]
not always possible
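The factored var rule above can be sketched as a tiny recursive-descent recognizer. Assumptions for the sketch only: an id is one or more lowercase letters, expr is simplified to an id or an unsigned integer, and the function names are made up. After factoring, one character of lookahead is enough to choose the V-hat alternative.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// id: one or more lowercase letters (assumption for this sketch).
bool parse_id(const std::string& s, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < s.size() && std::islower(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos > start;
}

// expr: simplified here to an id or an unsigned integer (assumption).
bool parse_expr(const std::string& s, std::size_t& pos) {
    if (parse_id(s, pos)) return true;
    std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos > start;
}

// V-hat -> nothing | [ expr ]  .. the next character decides which rule applies.
bool parse_v_hat(const std::string& s, std::size_t& pos) {
    if (pos < s.size() && s[pos] == '[') {
        ++pos;
        if (!parse_expr(s, pos)) return false;
        if (pos >= s.size() || s[pos] != ']') return false;
        ++pos;
    }
    return true;  // the empty alternative consumes nothing
}

// var -> id V-hat  .. the whole input must be consumed.
bool is_var(const std::string& s) {
    std::size_t pos = 0;
    return parse_id(s, pos) && parse_v_hat(s, pos) && pos == s.size();
}
```

With the unfactored rules the parser could not tell `id` from `id[expr]` after reading the id; with V-hat it never has to guess.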
removing left recursion:
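A → Aa | Aab | bB | ba
introduce A-hat
A → bB A-hat | ba A-hat
A-hat → a A-hat | ab A-hat | ε
for every alternative that is not left-recursive, add A-hat to the end .. for every left-recursive alternative, drop the leading A, add A-hat to the end, and make it an A-hat rule (plus ε).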
-------
Wed 10/01/08
The Plan
= chapter 5 today and monday
- monday - have exam back
"i like the stuff in chapter 5"
Chapter 5
- names - uppercase, lowercase, case sensitive .. special words: keywords, predefined identifiers, reserved words
what's a variable anyway
has the following attributes
- name, address, type, value, lifetime, scope
.. those are the 6 topics for variables
we've talked about the idea of an address as a memory cell .. that we can alias .. we aren't worried about the machine architecture address
when these things are bound
binding time .. when are these things assigned
six binding times:
-- associating an attribute with an entity
or associating an operation with a symbol
and most of that happens way early as opposed to the other stuff
six possible times:
language design time: as an example we talked about * means multiply .. the symbol got assigned an operation
language implementation time / compiler design time - the size of an int .. the compiler's responsible for allocating memory so it has to know how many bytes
compile time: that's when we associate the datatype with the name in c/c++
link and load
link: address of a function
load time: assign addresses to various things
runtime: values, some addresses, and of course lifetime and scope .. some of which is done by the compiler
int x .. the compiler comes across that and says "x is an int"
.. this is all a rehash
5.4.1 - binding of attributes to variables
two kinds of binding
static binding and dynamic binding
static .. occurs before runtime and remains unchanged during program execution .. so in c/ c++ the datatype of a variable doesn't change during program execution
dynamic means that the binding first occurs during runtime OR can be changed during program execution
5.4.2 .. type binding .. when am I assigning the datatype to the variable
how is the type specified .. that was one question and we answered that
and when does the binding take place
we talked about explicit and implicit
5.4.2.1.
static binding .. stuff that occurs before runtime ..
type can be specified
explicitly: int x;
or implicitly:
implicit were things like in C++ .. const size = 10; .. size was an int
in fortran , anything beginning with an i, j, k, l, m, n was an integer
everything else was a real
in perl
$name .. datatype is either numeric or string
or @ means array of something.
so I've said something about the type implicitly although I haven't really bound it to anything yet
so two ways of making static
dynamic type binding:
javascript
.. can change the datatype of the variable any time I want
the type of the variable is assigned dynamically whenever the variable is being assigned to.
.. convenient but less reliable
but the biggest problem with doing dynamic type binding .. type checking is done during runtime .. everything happens during runtime .. so have to be very careful
typically used by interpreters; not compilers
why can't the compiler deal with it?
suppose
c = A + B
the compiler has to generate machine language
.. need to know if they are ints, doubles, floats .. in order to get the right instruction
ADDL A, B
ADDS
ADDF
have to have different instructions in the machine language, and if you don't know ahead of time, can't generate the machine language.
and even worse when trying to do strings.
if types are going to change .. very difficult to generate machine language
so if I'm using dynamic type checking, I'm typically using an interpreter .. which is why it's so popular in scripting languages
type inference:
- types are inferred from context, i.e., what's around it
example the author uses
ML, a functional language
fun area(x) = 3.14159 * x * x;
datatype of x?
datatype of result?
inferred that the result is going to be a real (float / double)
but the datatype of x .. also going to be considered a real
so in this case, expecting a real
area(1.2)
but area(2) is rejected .. ML does not coerce ints to reals, so it has to be area(2.0)
fun Double(x) = 2 * x;
datatype of x .. an int and the result will also be an int .. because it sees an int
Double(2) works
but Double(1.2) fails
.. that's how the type inference works because 2 is an int.
fun Double2(x) = x + x;
what the heck does it do with that guy?
says numeric! which unfortunately defaults to int.
Double2(2) = ok
Double2(1.2) = not ok .. floats don't coerce to ints
fun Double3(x) : real = x + x; .. specifies the return type as real
fun Double4(x:real) = x + x;
fun Double(x) = x + (x:real);
so it's the context .. what's around it..
the type is implied .. inferred from what's around it.
5.4.3 - storage binding
we've talked about names, type .. now we want to talk about when I bind a variable to its memory location
allocation - variable is bound to a memory cell - in most cases cells
deallocation - the memory cells of a variable are made available for other uses.
lifetime of a variable: begins at allocation and ends at deallocation
when we talk about variables and their lifetime, our author categorizes variables into 4 categories based on lifetime
1. static
2. stack dynamic variables
3. explicit heap dynamic
4. implicit heap dynamic
remember, function calls and associated memory gets put on stack
further, certain variables are allocated from heap
1. static - bound before program execution begins - remember, binding memory location to variable
- remains bound throughout program execution to the same set of memory cells
so when the program ends, no longer bound
examples:
in C/C++ .. global variables
.. we don't know when the variables are bound for main .. probably not as soon as global variables
can also be inside of a function .. static int i .. also allocated space .. memory cell mapped to i.
its lifetime is the whole program
static is "history sensitive" .. allows us to remember info between separate function calls.
- implementation of static variables is very efficient
.. allocated and deallocated exactly one time! .. so always at the same physical address once the program begins .. allows direct addressing
fortran - 1980's .. all variables were static
.. pass by reference is really tough
.. function A can't call function A !!! no recursion possible!!
.. bummer!
another drawback to all variables static for memory binding.
function A
int a[1000];
function B
int b[100000];
i consume all that memory ALWAYS!
no sharing due to inactivity
no recursion, no sharing but efficient
fun()
static int a;
static class member .. has nothing to do with this
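A minimal sketch of "history sensitive", using two made-up functions. The only difference between them is the `static` keyword on the local variable.

```cpp
// Hypothetical counter: 'calls' is static, so it is bound to one memory cell
// for the whole run and keeps its value between calls (history sensitive).
int count_calls() {
    static int calls = 0;
    return ++calls;
}

// Contrast: 'calls' here is stack dynamic, so it is created fresh
// (and reset to 0) every time the function is called.
int count_calls_stack() {
    int calls = 0;
    return ++calls;
}
```

Three calls to `count_calls()` return 1, 2, 3, while `count_calls_stack()` returns 1 every time.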
2. stack dynamic variables.
- refers to the runtime stack .. variables whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound
int main()
{ int x;
...
in C and C++ we don't allocate space until a function is called.
it is an int .. we know that.
the storage binding exists as long as the function does and it's gone when the function is gone
what does it mean to be elaborated?
if I ever call this function, x will get the space necessary for an int .. so when the function is executed, the declaration will be elaborated.
so this is what we are most used to.
allows recursion .. because every function call gets its own frame on the stack
only using the space I need.
now i can do recursion
3. explicit heap dynamic
int * p;
p is allocated stack dynamic
somewhere out in the heap -- memory shared by various processes --
when you say p = new int;
you've just gotten a space on the heap
on the heap - nameless .. I get to refer to that through * p but it's nameless.
these are allocated and deallocated explicitly during runtime by program instructions
allocated - get memory location
all deallocation does is mark the space as available
if the function ends, the space still belongs to us, but we can't get there.
some languages do garbage collection ex: java .. will find those guys and return them.
and that's particularly important in java because all but scalar / numeric variables are allocated on the heap
.. all but the most primitive datatypes are objects .. all are allocated on the heap - very different than c++, much more like c#
declaration in C# to allow you to use pointers
but this is costly! .. if every time I want a variable I have to go to the heap .. two memory accesses
awful lot of memory access just to get to the guy I want
that's 3 out of the four
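A short sketch of an explicit heap-dynamic variable, with a made-up function name. The pointer `p` itself is stack dynamic; the int it points to is the nameless cell on the heap, and it has to be given back by hand.

```cpp
// Hypothetical helper: allocates a nameless int on the heap, uses it
// through the pointer, and gives the cell back before returning.
int square_on_heap(int v) {
    int* p = new int;            // p is stack dynamic; *p lives on the heap
    *p = v;                      // first access reads p, second reaches the cell
    int result = (*p) * (*p);
    delete p;                    // explicit deallocation: cell marked available
    p = nullptr;                 // don't keep a dangling pointer around
    return result;
}
```

Leaving out the `delete` would not break this function's result, but the cell would stay allocated with no way to reach it once `p` is gone.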
-------
Mon 10/06/08
phrases -- all things that descend from a variable (nonterminal)
simple phrases -- can happen in 1 step
handle .. is the leftmost simple phrase
chapter 5.
attributes of variables
---------------------
5.4 - binding of variables to memory
static variables - bound before the program execution begins .. bound to the very same piece of memory throughout .. global variables and cin and cout
2. stack dynamic variables -- anything that goes on the run stack .. activation records .. all go on the stack
per function call
variables that belong to those functions are placed on the stack within their own activation record
this is what we're most used to
3. explicit heap dynamic
.. ones that we explicitly allocate and deallocate
new and delete .. we ask for memory and we give it back
out in the heap .. not part of the stack
request memory cells .. very important that we give back .. typically in shared memory
4. implicit heap dynamic variables
these are things that we're not well acquainted with in c and c++
bound to heap storage when assigned
so this is what happens in javascript for all variables .. whenever I assign, i change its type
- in fact, all attributes are bound whenever an assignment is made -> javascript
Type checking
- - - - - -
related to a variable .. all (?) variables have a datatype
the idea of making sure we have a legal operation
idea of appropriate operands for operators
a + b .. are a and b legal operands for +
f(a,b) .. f is technically an operation and a and b are its operands
sometimes it's good enough to be compatible
we say something is compatible if it's legal for the operation
so in this case, clearly ints are allowed
also, assignment is an operator .. can I coerce or convert
allowed by the language rules to be explicitly or implicitly converted/coerced.
we know if a is an int and x is a double that a = x is allowed .. ints and doubles are compatible types but not identical
void f( int a, double &y);
f(a,x) is ok
f(x,x) is ok .. I can shove a double into an int
f(x,a) is not ok .. because the reference parameter needs identical types
type error --
char ch = "mcvey";
when do we know we have a type error?
- depends if types are bound statically before program execution, we'll know before the program begins
/ at link time if it comes from a separate file
if types are bound dynamically, as in , javascript
then of course i have to do it during runtime
Strongly typed languages - coined in the 1970's
means that all type errors are always detected
C & C++ are not strongly typed
union utype
{ struct xtype
{int a;} x;
struct
{int b;} y;
struct
{int a; double b;} z;
} u;
u.x.a = 75;
u.z.a
I have an array of utype and each element of the array can be just a little different
if I have something of utype, might be a at one point, b at another point
overlay .. gives one chunk of memory .. sometimes it's an int a with a little extra, but gotta allocate enough space to hold both an int and a double
variant .. gotta be treated as one of the three.
if (input == __)
u.x.a = ___
else if( ..
so there's no way I can type check .. so it's not strongly typed
otherwise, c++ is strongly typed (throw polymorphism out the window too)
strongly-typed is desirable .. almost no languages achieve it
type equivalence -
when are two types the same type
struct xtype {
int a; double b;};
struct ytype {
int a; double b;};
struct ztype {
int c; double z;};
i have three data types
xtype x;
ytype y;
ztype z;
can i say x = y?
can I say x = z;
name equivalence vs structural equivalence
c++ uses name equivalence .. if the name isn't the same, it's not the same type.
typedef ztype Fred;
Fred f;
f = z;
typedef .. take an existing type and make Fred be an equivalent type.
C cheats -- what a typedef does is what it does with const .. everywhere it sees Fred it puts ztype
structural equivalence .. xtype, ytype, and ztype would all be the same
type checking of that is horrendously difficult compared to type checking by name.
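A compile-time sketch of name equivalence in C++, reusing the struct shapes from the notes. The alias here is made on xtype rather than ztype just to keep the block short, and `copy_through_alias` is a made-up helper. The `static_assert` lines ask the compiler directly which assignments it accepts.

```cpp
#include <type_traits>

struct xtype { int a; double b; };
struct ytype { int a; double b; };   // same structure, different name
typedef xtype Fred;                  // alias: another name for the same type

// Name equivalence: identical layout is not enough.
static_assert(!std::is_same<xtype, ytype>::value, "distinct names, distinct types");
static_assert(!std::is_assignable<xtype&, ytype>::value, "x = y is rejected");

// A typedef does not create a new type, so the alias is assignable.
static_assert(std::is_same<xtype, Fred>::value, "a typedef names the same type");
static_assert(std::is_assignable<Fred&, xtype>::value, "f = x is accepted");

// Hypothetical helper showing the legal assignment at run time.
int copy_through_alias(int v) {
    xtype x{v, 0.0};
    Fred f{};
    f = x;            // fine: Fred and xtype are the same type
    return f.a;
}
```

Under structural equivalence the first two assertions would go the other way, since xtype and ytype have the same fields.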
-------
Wed 10/08/08
5.8 Scope
why would I want structural equivalence?
as defined last class
we look at these and know they're structurally the same
by name equivalence, everybody's different
structural equivalence .. well it depends
two ways that this is implemented - names and datatypes of fields must match
or datatypes of fields are good enough
so if just datatype is needed .. all three are equivalent
otherwise only xtype and ytype (fields a and b) match
so different ways of defining structural equivalence
primary advantage: if we're trying to break a program into components and you write part and I write part and we want to combine but use separate files, I need to deal with datatypes that aren't defined in my file
.. so you don't have to deal with a global file .. so gives me flexibility
but also a bear to type check.
testing equivalence at the datatype level is very time consuming and hard
name equivalence is a one time check
5.8 is about scope
dealt with that last time in lab
int i
for (i= ...
and then try to use i later
that way is happy
declared in the for loop .. it depends
scope rules for c and c++ have been tightened up over time
6.0 allows you to do anything .. .Net is more restrictive
define these things formally.
definitions:
a variable is visible in a statement if it can be referenced in that statement.
the scope of a variable is the range of statements in which the variable is visible.
A variable is local to a program unit if it is declared there
a variable is non local to a program unit if it is visible but not declared there
these definitions probably seem pretty trivial given C and C++
but the deal is not everyone scopes the same way.
so some interesting things
you know that in C or C++ if I say:
int i;
i = 17;
{
int i;
i = 45;
cout << i;
}
cout << i;
I get two different outputs
i1 is local to the block that is outside the curlies
i2 is local to the inner block
i1 is not visible in the inner block because it's hidden by i2
.. not a good coding style ... but that's the scope of the variable
nesting block statements can get us into trouble
java and c# don't allow nested definitions of the same identifier
with respect to c and c++ we don't have a lot of issues with local and non local
int y;
int main()
{
int x, y;
}
void f()
{
int x, z;
}
void g()
{
int a, x;
}
where is the global y visible? .. we can't reference it inside main .. our local y wipes it out
can reference the y from the body of f and g - that's the visibility
where's f visible: only in f or g
x -- only in the body of main .. both visible and in scope .. he's local
the global y is non-local to f and g (and hidden in main) .. that's the idea of local vs non local
is it declared there?
either it's global or it's local .. only way I can have a non-local identifier is if it's a global identifier .. such as function names or global variables or global constants
so scoping in c and c++ is pretty easy
c is stricter
in c++
void h() {
cout << aaa;
int x;
}
in C .. all variable definitions must occur before any executable statements
so the variable declarations are separated from the body in C
so in C++ the scope of x in h() is not the whole body of h .. only from its declaration on down .. in C .. it can only be the whole function
... down to the closing of the block
other languages allow something a lot more interesting
this discussion of scope does not apply to OOP .. public and private variables --> more complex
one thing they don't allow .. c, c++ .. do not allow nested sub programs
int main()
{
int x;
void f() {
//
} // local function declaration
}
not allowed in c and c++
is allowed in a lot of other languages
there are some good reasons for doing so and some crappy reasons
bmcvey prefers the c and c++ way
a variable local to main .. not local to f .. a subprogram .. but is visible to f in a variety of languages.
pascal allows nesting of subprograms .. see handout
in pascal the executable statements start with begin and end with end.
program A: defines procedures B and C in its body
see handout
in pascal, a procedure first has its declarations
and then its executable portion
and a procedure can declare another procedure
pascal function and procedure .. like in qbasic
static scoping.
- known before runtime
dynamic scoping
- determined during runtime
dynamic scoping is scary!!
but static .. should be able to look at the code and know exactly which variable I'm referring to.
suppose B were main .. then x is a global
so inside of B, can use x:A
idea of static scoping ... OK I have this identifier
step 1 .. is it declared locally .. if it is, that's its scope .. that block
so for the y defined in B, its scope is D and B
but if I were to make reference to x in the block of B, it's not in B, so I go one step out .. and that's A
if it's declared locally, local scope
else, look at the static parent .. enclosing block
so each time I try to step just a little bit further out.
static parent and ancestors
so if I have a variable I'm referencing in E, I look to see if it's declared in E
if in E I'm using a y, I look in E .. no y .. go up to C .. no y .. go to A .. there is no y .. then this is a compiler error
static scoping .. I can look at the code and tell which guy I get to use and am using .. based entirely on where things are defined in my program
so in C, what can I access?
x:C z:C E F .. not y .. doesn't know anything about y
also, B .. they're in the same block and B came before.
x:A is obliterated by the x:C
depends on the language whether B can call C or not .. in C++ .. needs prototypes .. pascal was the same way
who can use A's x?
B, D
some advantages .. you can nest sort and search so you can use different algorithms for different situations
read 5.8
what if I want to allow F to call B. -- yes! F knows about B
.. everyone knows about B
as you step out of your block, you get the stuff that's out there
dynamic scoping:
could be implemented in C++
it depends on the order of the function calls
.. determines scope .. i.e., which variable
and that is really ugly
when we did static .. we looked at the static parent
my static parent is determined by where in the program I'm defined
my dynamic parent .. the function that called me.
deal .. if x = 7
did I define it in function 1?
if I did, that's who's got the 7
if not found, I look in function 3 ..
. . . the order of execution determines which x I assign the 32 to .. that certainly doesn't seem natural. no good reason for this
so we are typically involved in static scoping
most languages .. such as imperative .. use static scoping
if main calls f1, f2
returns ..........
so depending on order of execution I could be assigning to two different variables with the same line of code.
scope and lifetime are not always related .. a variable can exist but not be visible
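The dynamic-parent lookup described above can be simulated in C++ with an explicit stack of frames. This is only a sketch: C++ itself is statically scoped, and `Frame`, `call_stack`, `lookup`, `f1`, `f2`, `f3` are all made-up names for the illustration. The single assignment inside `f3` lands in a different x depending on who called it.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simulation: each active call pushes a frame of its own variables.
using Frame = std::map<std::string, int>;
std::vector<Frame> call_stack;

// Dynamic scoping: search the current frame, then the caller's, and so on.
int* lookup(const std::string& name) {
    for (auto it = call_stack.rbegin(); it != call_stack.rend(); ++it) {
        auto found = it->find(name);
        if (found != it->end()) return &found->second;
    }
    return nullptr;  // no active function declares the name
}

// f3 declares no x of its own, so "x = 32" goes to whichever caller has one.
void f3() {
    call_stack.push_back(Frame{});
    if (int* x = lookup("x")) *x = 32;
    call_stack.pop_back();
}

// f1 declares its own x and calls f3; returns f1's x afterwards.
int f1() {
    call_stack.push_back(Frame{{"x", 7}});
    f3();
    int result = call_stack.back()["x"];
    call_stack.pop_back();
    return result;
}

// f2 declares no x, so f3 reaches past it to the frame below.
void f2() {
    call_stack.push_back(Frame{});
    f3();
    call_stack.pop_back();
}
```

To try it, push a frame holding an x to stand in for main, then call `f1()` and `f2()`: the first call changes f1's x and leaves main's alone, the second changes main's x, even though the line doing the assignment is the same.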
-------
Mon 10/13/08
Chapter 5 - done!
- binding of attributes
LAST TIME:
- static scoping
- dynamic scoping
5.9
5.11 .. named constants
5.10 referencing environments .. inverse relationship of scope .. read it but no lecture. --> instead of saying here's the variable, where can I use it .. it's here's where I am, what can I use
chapter 6 - data types
fortran is great at large numbers .. cobol only wants money
so let's look at datatypes
datatype: a collection of values and the operations on them
they've evolved over time .. fortran was great at floating points
cobol is pretty good at strings.
strings are interesting -- are they primitive or something more ..
when we add two numbers the hardware does it
adding two strings .. very very different
.. than the arithmetic logic unit
scalar types
double, int, char - can also call them in some sense the primitives
everything we can build on
and then the structured types
arrays, records (structs), and depending on the language the strings are somewhere in between
regardless of the type, variables typically have what are called descriptors
descriptor contains the attributes of a variable, of which one is the type
sometimes the descriptor is static .. if no changes after compile time
dynamic if changes are made to the attributes
.. and this is, of course, except for the attribute of value
in a lang where all the attributes are static .. the descriptor is only needed when the compiler is running
in a situation where I can change .. the descriptor hangs around during execution
one of the things the descriptor is used for is type checking .. whether it be by the compiler or during program execution
so when we talk about checking the variable type, we're looking at the descriptor
for us we can just say "we're looking at the variable"
primitive datatypes int, float, char, ...
we know that they're stored differently etc
something way more interesting is the next section .. character string type
throughout the section .. first points out the design issues
.. what do we want to be able to do with them and how shall we implement them
string -- a sequence of characters
design issues: is it a primitive type? or a special kind of character array
other thing: static or dynamic length.
primitive data types -- atoms .. they're the part that you can't break apart
the smallest part that's a whole by itself.
an int, I need the whole thing to be meaningful
a string .. what do I need to be meaningful .. is it just a collection of chars?
if a string is a primitive, I can't get to the individual characters -- I either get all or nothing .. unpartitionable
so a primitive string .. assign it once, and that's what it is, pretty much constant.
"almost" because several languages allow me to actually search it for a pattern but not to manipulate a string like we think of it.
Operations on / with strings
1) add -> concatenate
2) ask about the length
3) search / pattern match
4) split
5) replace
6) assign
7) compare
.. have to be able to do substring reference .. get at a part of it
and perl is great for pattern matching
how different languages implement strings
so the first thing is to decide whether it's a primitive or specialized character array
is it always length x or can I change it
in C++, when I use a string
string s1; I do indeed have a string
in c
char s1[10];
char s2[] = "McVey"; .. really 6 because of the null terminator
examples:
C++
s1, s2, s3
s1 + s2
s1 = s2
....
so up here in c++, the string is a datatype .. but a class .. am using built in functions, it just doesn't look like it
c
---
s1[10], s2[10];
can't do s1 = s2;
strcat vs just +
strcpy(s1,s3);
in c++ if s1 > s2
vs c
if strcmp(s1,s2)
negative, 0, positive based on which is greater
all kinds of issues with length with just these two simple implementations
perl
----
$x = "mcvey";
can't do $x[i];
java
----
String class
manipulate .. StringBuffer class
very array like .. once you create a String, can't change it
StringBuffer is changeable
substrings, pattern match
most of these languages that allow me to manipulate strings allow me to search them.
bigger implementation issue is about the length
static length .. fixed and can't change it once it's created
doesn't mean all the strings have the same length
in C, char s[10]; // max length of 10
everything is null terminated so when it runs into the null, that's the end
can use all 10
allocated a particular part of memory
certainly for the string primitive .. that's what I've created, that's what I've got
or can have dynamic length.
- variable length with respect to a max
- variable length unlimited.
when we implement strings in c++ they are dynamically allocated.
.. sizeof() is the size of the object
length vs capacity
c++ adds 32 as needed.
so do i have unlimited capacity? well eventually we'll run out of heap space
static char array -- ought to know its size .. but it doesn't do bounds checking.
static array so its descriptor should look something like
static array of char
need its length of 10
and then its address of a[0]
string
dynamic array of char
max length (capacity)
current length (length)
address
that's the kind of information the descriptor needs to store if it wants to say whether that type is being used correctly
and it's needed for range checking
C++ maintains that info, but does not do bounds checking unless you ask.
b.at(i) -- will check to see if 0 ≤ i < length
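A sketch of what "unless you ask" looks like: at() throws std::out_of_range for a bad index, while [] does no check. at_accepts is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if index i is accepted by the range-checked accessor.
bool at_accepts(const std::string& s, std::size_t i) {
    try {
        (void)s.at(i);      // throws std::out_of_range when i >= s.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

For "McVey" the valid indices are 0 through 4; index 5 (the length) is rejected.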
implementation:
1. linked list to store dynamic strings .. one char at a time -- not practical, but could use arrays of 32
2. array of pointers
3. the heap - always allocating and deallocating .. if I need more, I get more.
so I spend time copying in #3.
6.4 user declared types .. she'll come back to that
6.5 - array types
design issues .. what are legal subscripts
everything we've seen uses an integer as a subscript
should we do range checking on indices?
when does allocation occur?
can ask for a huge array and never use it
ragged or rectangular
slicing -- how can I pull the array into pieces
array
array name
some index
and we map that to some element
a(i) or a[i]
or a[i][j] or a[i, j]
name of the array as address of the first thing
address of a[0] = a
a[i] = contents of a + i * sizeof(type) .. so a[0] = contents of a + 0 * sizeof(type)
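A sketch of the subscript-to-address mapping, assuming an ordinary flat address space (converting a pointer to an integer is implementation-defined in general). element_address and matches_builtin_indexing are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Address computed by hand for a[i]: base + i * sizeof(element).
std::uintptr_t element_address(const int* a, std::size_t i) {
    return reinterpret_cast<std::uintptr_t>(a) + i * sizeof(int);
}

// Compare the hand computation with what the compiler does for &a[i].
bool matches_builtin_indexing(const int* a, std::size_t i) {
    return element_address(a, i) == reinterpret_cast<std::uintptr_t>(&a[i]);
}
```

With i = 0 the formula gives back the array name itself, which is the "address of a[0] = a" line above.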
chapter 6 {2,.... .. record types
-------
Wed 10/15/08
data type - collection of data values and predefined operations on them
- descriptor - collection of the attributes of a variable .. such as type and value, memory location, ranges and things like that .. all of those things can be static or dynamic
if all attributes are static, other than value, the descriptor is only needed during compile time, not execution time
if anything is dynamic, then I need it during runtime
usually built by the compiler as much as possible .. what I know is filled in
.. and part of the symbol table .. big table of the identifiers and their information
that we begin creating during lexical analysis
used for type check .. if the type is known at compile time, I can type check then, otherwise, need the descriptor so I can type check during runtime
used for building code needed for storage allocation and deallocation
so the things that we're talking about .. we'll draw them :-D
descriptor → what keeps the attributes of variables.
primitive types:
a primitive is a type that is not defined in terms of other types
what string could mean as a primitive type or a character array -> def. involves character
three main primitive types - numeric, boolean, and char
talks about ASCII, Unicode .. way back in the 70s, there wasn't one character encoding
bool .. one byte
in languages that don't have bool, 0 is false and non-zero is true
numeric .. integer -- and even in C and C++ a ton
int, unsigned int, short int, long int
2's complement and 1's complement
floating point numbers: double, float .. a double has more digits, therefore better precision and range
some will have a complex number primitive:
a+bi.
decimals vs doubles and floats
we know that there's all kinds of numbers that we can't represent accurately
1/10 is repeating with respect to zeros and ones
.. can't represent that exactly .. but we better be able to do a dime
so when it stores decimal numbers, how many bits do I need to store a digit .. 0-9
.. need four bits to represent a decimal digit
so I can represent two decimal digits in a byte
how do you do arithmetic on that?
either the computer is built to handle decimals .. or it's all done with software .. which is hugely slow .. but often this is what they want.
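A sketch of the "two decimal digits in a byte" idea and of doing the arithmetic in software, digit by digit. This is one plausible packed-decimal layout (tens in the high four bits); pack_bcd, unpack_bcd and add_bcd are made-up names, not any particular machine's format.

```cpp
#include <cassert>
#include <cstdint>

// Pack two decimal digits (each 0-9) into one byte: tens in the high
// four bits, ones in the low four bits.
std::uint8_t pack_bcd(unsigned tens, unsigned ones) {
    assert(tens <= 9 && ones <= 9);
    return static_cast<std::uint8_t>((tens << 4) | ones);
}

// Recover the ordinary binary value from a packed byte.
unsigned unpack_bcd(std::uint8_t b) {
    return (b >> 4) * 10u + (b & 0x0Fu);
}

// Software addition of two packed bytes: add digit by digit, carrying in
// base 10. Returns the packed two-digit result; carry_out reports overflow.
std::uint8_t add_bcd(std::uint8_t x, std::uint8_t y, bool& carry_out) {
    unsigned ones = (x & 0x0Fu) + (y & 0x0Fu);
    unsigned carry = ones > 9 ? 1u : 0u;
    if (carry) ones -= 10;
    unsigned tens = (x >> 4) + (y >> 4) + carry;
    carry_out = tens > 9;
    if (carry_out) tens -= 10;
    return pack_bcd(tens, ones);
}
```

Every add takes several shifts, masks and compares, which is why doing decimals in software is slow compared to a single binary add.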
6.3 character string types
- sequence of characters
design issues -- 1) should it be primitive or not -> special array of char
some languages say primitive, some say array, some say both
length - static or allow it to change
languages do make these decisions, but they are not very good at telling you which it is.
decisions made by the language developers and implementors
operations
- assignment .. can I assign one string to another, concat, identify substrings, manipulate substrings .. compare two strings
pattern matching
simple things such as compare and assign depend on what has happened with the length or whether it's primitive.
so when I talk about things being
what about two strings of different length?
.. affects assignment .. if the guy on the left is bigger and the one on the right is smaller, that's usually pretty easy to solve
copy the string and then what do you do with the extra room .. put in a null terminator for C
if the bigger one is on the right
in most cases, it runs past memory.
.. are we notified that this is a problem? some languages will say you can't do this .. and some, like C, just say "ok"
and then your program will crash when you try to use the data you wrote over
primitive string vs non-primitive .. Java
Java String class.
primitive class called String
and array of char called StringBuffer
with string primitive .. can't just swap characters
can't do it with the actual string .. it's primitive .. can still access but can't go messing around with the chars.
concat with + and some pattern matching through some library
primitive .. you don't get to mess with it .. it stays the way it is once it's created
treated as a constant
can ask for substring
split string into pieces
can chop off part of it
trim is about whitespace
these are class functions that manipulate the string .. I'm not manipulating it through subscripts
sometimes in pattern matching, we can use operators, but in C and C++ we pattern match using libraries
6.3.3 - string length options:
most interested in what the terms mean in this section.
- static length string - fixed from the beginning -- set when created.
- java String class
- python .. really interesting .. allows us to do stuff with the string, but then there are other things that are primitive
- isn't a c-style array
- limited dynamic length string - max length - can be anywhere from 0 to maxlength .. C-style strings (in C and C++)
- dynamic length string
static descriptor
- has length and address .. length is always the same
limited dynamic length string descriptor
maxlength
currentlength
address
interesting thing about strings like this .. the descriptor has to be there during execution time so that we can change the length.
Dynamic .. as long as I want
descriptor:
current length
address
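A sketch of how a run-time descriptor for a limited dynamic length string could be used. LimitedStringDescriptor and assign are hypothetical; C itself keeps no such record, which is why it can't catch the too-long case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical run-time descriptor for a limited dynamic length string.
struct LimitedStringDescriptor {
    std::size_t max_length;      // fixed when the string is created
    std::size_t current_length;  // may change, anywhere from 0 to max_length
    char*       address;         // where the characters live
};

// Assignment consults the descriptor and refuses instead of running past memory.
bool assign(LimitedStringDescriptor& d, const char* src) {
    std::size_t n = std::strlen(src);
    if (n > d.max_length) return false;   // right-hand side is too big
    std::memcpy(d.address, src, n);
    d.current_length = n;
    return true;
}
```

A fully dynamic string would drop max_length and reallocate instead of refusing.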
C.
char s5[5]; - limited dynamic length string .. limited by the 5.
but there's nothing to stop me from copying into a new string with longer maxlength.
.. even though the size of the array is 5, I get cat
.. so really store: cat\0
so in C, we don't need the current length, and C doesn't care about the maximum length either.
three implementations
linked list
array of pointers
contiguous memory - allocate and deallocate whenever I need more space
linked list .. good for sizes that change a lot
array of pointers -- iterating over sequential places in memory -- a little better, not a lot
contiguous .. get to step from one byte to the next, but takes time allocating and deallocating
.. and during the copy, we have almost twice as much memory as needed.
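A toy version of the contiguous-memory option, assuming growth in chunks of 32 as in the earlier notes. DynString and its helpers are invented for illustration; real string classes differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy dynamic string kept in one contiguous heap block.
struct DynString {
    char*       data = nullptr;
    std::size_t length = 0;
    std::size_t capacity = 0;
    std::size_t copies = 0;   // how many characters we had to move so far
};

void append(DynString& s, char c) {
    if (s.length == s.capacity) {
        std::size_t new_cap = s.capacity + 32;   // grow in chunks of 32
        char* bigger = new char[new_cap];
        // Old and new blocks both exist while we copy.
        if (s.length > 0) std::memcpy(bigger, s.data, s.length);
        s.copies += s.length;
        delete[] s.data;
        s.data = bigger;
        s.capacity = new_cap;
    }
    s.data[s.length++] = c;
}

void append_many(DynString& s, std::size_t n, char c) {
    for (std::size_t i = 0; i < n; ++i) append(s, c);
}

void release(DynString& s) {
    delete[] s.data;
    s = DynString{};
}
```

Appending 70 characters this way reallocates three times and moves 32 + 64 characters along the way, which is the copying cost mentioned above.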
6.5 - Array types
- homogeneous aggregate of data elements in which an individual element is identified by its position relative to the first
design issues - 1. what are the legal subscripts
2. do I do range checking?
3. when are subscript ranges bound - binding time
4. when does array allocation take place
5. ragged or rectangular or both
6. initialization - can I initialize while I'm allocating
7. what kinds of slices .. subarrays
Indices
- - - - -
array, subscript --> map to element
a[i] is really a + i * sizeof(datatype); .. physical .. we don't have to know that .. it does it for us
a[][]
or b[][] .. but there are folks who do it like [a,b] and a(i) or a(i,j)
what are the valid indices and do we check ranges
C, C++, perl and fortran .. no range checking
java, C# do. --> slower
Ada - optional
C++ .. if I use the vector class, really an array, a[i] does not range check; a.at(i) returns the same thing and does range check.
languages based off C all start indexing at 0
fortran - starts at 1
Pascal / Ada ..
-> type Atype = array[1..10] of ___;
this is indexed from 1 to 10.
can specify my indices to be characters
for a two d array
[1..100, 'a'..'z'] .. so now I have a spreadsheet
array [1980..1999]
can be all kinds of things .. may or may not be checked for range -> slows things down.
binding of subscripts .. when does it happen
subscript type - in C and C++ it is always an integer type
in pascal, it could be different
but this is usually statically bound
subscript ranges - sometimes, in fact, many times, are dynamically bound
storage allocation .. could be on the heap, could be on the stack, could be static or dynamic
categories:
1) static array - don't confuse with the basic C++ array
range and allocation are done before runtime
- most efficient way of doing things .. no dynamic allocation
static int a[] = {2,3,5};
doesn't change
2) more common - fixed stack dynamic - subscript range is statically bound .. known at compile time!
allocation of storage is done at declaration / elaboration
.. when I encounter the declaration inside my program .. more efficient space wise .. because if I never call f, I don't need the array in it.
3) stack dynamic -
subscript range and storage allocation are done dynamically
.. at elaboration time
now the range as well as the allocation
would allow
int B[a]; // can use a variable!
.. can't do this in standard C++, but that's what it means .. when I get to this line right here, I'm allocating storage at this point.
-------
Fri 10/17/08
see handouts.
rangecheck.cpp
when multiplying 2d arrays, lots of access to arrays .. range check would really slow things down.
time:
32 vs 47 for range checking .. increased the time by about half
so while developing I like the idea of range checking .. once it's sound, I'd love to turn it off
.. so Ada has it right .. defaults on but gives you a choice.
array3.pl
---------
perl is one of the most flexible languages we have as far as what's built in for arrays
I can define and initialize - not all languages do
allow the array to figure out its size
in c++
can't say cout << A;
or cin >> A;
perl says "why not!"
print @a;
@a = <STDIN>;
can do it all at once, or element by element
arrays know their own length
scalar(@a) .. gives the number of elements in the array.
C++ .. they don't know
push puts something at the end of the array .. uses the length again ... @a is treated as an integer here rather than the whole array as in the print statement
use of square brackets also allocates more space
pop(@a) .. removes the last item
shift removes the first guy and moves everyone forward
splice ..cut out a piece .. perl does that nicely
splice works on 2d and 3d
so we can do all sorts of stuff that we couldn't do in c++
can change the size ..
can assign an array to another array
in C++ .. there's no way.
can't ask if A==B in C++ (for built-in arrays it compares addresses, not contents)
seems reasonable
@a eq @c in perl
.. but just comparing their length
so perl lets me do all kinds of things others don't
if I add the 25th item to an array it will make the spots up to 25
so no such thing as range checking .. it'll just expand it.
$a[-2] .. counts back from the end!
vector.cpp
C++ .. a vector is very much a dynamic array
.. at its core is a dynamic array
.. the actual space is located out on the heap
the vector class does (almost) everything that perl does.
vectors know how long they are.
.. can dynamically create space: push_back
vectors can be accessed in exactly the same way that arrays are
print function comes in .. I don't need to send in the size of it.
pop_back() removes the last element like perl's pop (but does not return it)
so this data structure is very much like the perl arrays
allowed to do assignment of vectors
did it make a shallow copy? nope!
this time we can say it compares by values rather than length
can't read in the whole vector at once
nor can we cout a vector all at once
vector initializes everything to zero
C.at() range checks.
we can index past the end with brackets, but the vector size is still the original size.
comparison is very different.
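A short sketch of the vector behavior listed above (deep-copy assignment, comparison by value, growth with push_back). The helper function names are made up; the vector operations themselves are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assignment makes a deep copy: changing the copy leaves the original alone.
bool assignment_is_deep_copy() {
    std::vector<int> a = {2, 3, 5};
    std::vector<int> b;
    b = a;
    b[0] = 99;
    return a[0] == 2 && b[0] == 99;
}

// == compares the sizes and then the element values, not just the lengths.
bool same_contents(const std::vector<int>& a, const std::vector<int>& b) {
    return a == b;
}

// push_back grows the vector, and the vector tracks its own size.
std::size_t size_after_push(std::vector<int> v, int x) {
    v.push_back(x);
    return v.size();
}
```

Two vectors of the same length but different values compare unequal here, unlike the perl length comparison.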
also, perl and javascript don't care if you have different types in your array.
5 categories
- range binding, allocation binding, and where the allocation is made
1. Static Array - most boring .. everything is known before runtime: range bound, allocation made in static storage (not on the stack), before runtime.
in C++
static int c[5] = {BLAH};
when I load it, I know it
2. fixed stack dynamic
"stack dynamic" = allocated on the stack
range is statically bound - before runtime
- allocated at elaboration
.. way more efficient space wise .. only allocate if I need.
3. stack dynamic
- allocated on the stack
- range is dynamically bound at elaboration
- allocated at elaboration time
int x;
cin >> x;
int a[x];
.. we don't have these (in standard C++) but the cool thing about these guys .. the size is specified when created.
4. fixed heap dynamic
- allocated on the heap
- subscript range and storage are bound when program requests storage
we're fixed from the point of view that we can't change the range unless I reallocate
so that's
int *p;
p = new int[5];
at this point, allocating space on the heap, p is tied to that address and my range is 0 to 4
p is a local variable on the stack and the space is allocated out in the heap
and I cannot make that grow or shrink unless later I say give me more space.
all arrays in Java are allocated this way .. java does not allow you a pointer .. it does allow you to get space, it handles all the returning.
C# as well.
what's nice is that if I really do want to allocate and deallocate, C# and java have eliminated a lot of the pointer errors, but it's not always real clear what happens in the background.
5. the most free-wheeling - heap dynamic array
good old perl
- allocated on heap
- range is "bound" when elaborated but may change often during execution
@a = (1,2,4);
$a[15] = 15;
.. we go from size 3 to size 16.
storage is "bound" at elaboration but also may change during execution
we don't know how perl stores this stuff
C# ArrayList .. does this
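A rough C++ rendering of four of the five categories, as a sketch. Stack-dynamic is left out because standard C++ has no variable-length arrays, and std::vector stands in for the perl-style heap-dynamic array (it has to be asked to grow). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

static int primes[3] = {2, 3, 5};   // static: range and storage fixed before run time

int first_static() { return primes[0]; }

int sum_fixed_stack_dynamic() {
    int a[3] = {1, 2, 4};           // fixed stack-dynamic: range known at compile
    return a[0] + a[1] + a[2];      // time, storage allocated at elaboration
}

int sum_fixed_heap_dynamic(std::size_t n) {
    int* p = new int[n];            // fixed heap-dynamic: range 0..n-1 is bound
    int sum = 0;                    // when the program requests the storage
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = static_cast<int>(i);
        sum += p[i];
    }
    delete[] p;                     // explicit give-back
    return sum;
}

std::size_t grow_heap_dynamic() {
    std::vector<int> a = {1, 2, 4}; // heap-dynamic: the range can change later
    if (a.size() <= 15) a.resize(16);   // unlike perl, growth must be requested
    a[15] = 15;
    return a.size();
}
```

grow_heap_dynamic mirrors the perl example: three elements, then an assignment at index 15, ending with 16 elements.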
very interesting stuff to pull out a chunk
RowMajor and ColumnMajor
suppose
A
[ 2 3 4 5]
[ 1 3 4 5]
if that's row major, in memory it looks like this
2
3
4
5
1
3
4
5
in column major, it would look like
2
1
3
3
...
column major was fortran
pretty much everything else is row major.
have to specify just one bound to pass a 2d array -- in C++ it's the number of columns so that we know where each row splits.
A[i][j]
&A[0][0] + i * (number of columns, ie row size) + j .. counted in elements, so scale by sizeof(type) for bytes
and that is for row major
column major .. different computation
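A sketch of both offset computations, checked against the 2 x 4 example matrix above. The function names are made up; offsets are in elements, not bytes.

```cpp
#include <cassert>
#include <cstddef>

// Offset of A[i][j] in a row-major array with `cols` columns.
std::size_t row_major_offset(std::size_t i, std::size_t j, std::size_t cols) {
    return i * cols + j;
}

// Offset of A[i][j] in a column-major array with `rows` rows.
std::size_t column_major_offset(std::size_t i, std::size_t j, std::size_t rows) {
    return j * rows + i;
}

// C++ stores built-in 2d arrays row major: compare byte distances.
bool agrees_with_builtin(std::size_t i, std::size_t j) {
    int A[2][4] = {{2, 3, 4, 5}, {1, 3, 4, 5}};
    const char* base = reinterpret_cast<const char*>(&A[0][0]);
    const char* elem = reinterpret_cast<const char*>(&A[i][j]);
    return static_cast<std::size_t>(elem - base) ==
           row_major_offset(i, j, 4) * sizeof(int);
}
```

In the column-major listing above, the 1 from A[1][0] sits at offset 1 and the first 3 (A[0][1]) at offset 2, which is what column_major_offset gives with rows = 2.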
-------
Mon 10/20/08
Exam 2 - Monday, Nov 3
Perl and some of its features ..
how different datatypes are implemented by various languages
Last time:
- - - -
- arrays
today
- associative arrays
- structs
- pointers/references
section 6.6 associative array.
- unordered (!) collection of data elements that are indexed by an equal number of values called keys
(indices of non-associative arrays are not stored)
int a[5] .. the descriptor says array of int .. but not storing the indices as anything .. we know what they are
in pascal .. array of something
have a low index, and a high index
and probably have a datatype of index .. need all of that because I can go from a to z as indices or 1990 to 1999
associative arrays .. no such thing as indices .. key to item
perl, python, and ruby support these
python calls them dictionaries
perl calls it a hash .. the hash object that we were looking at
so these are actual datatypes in those languages
others support these in class libraries:
C++, Java, C#
design issue -- what are the allowable keys?
.. typically they're either strings or numeric
perl calls these guys a hash because the key values are hashed .. stored according to hash values of the keys.
hash function takes some piece of data and produces some number
based on that number, I might put things in a particular list
we don't have to know the hash function that perl uses to use it
so examples will only exploit how things are stored
%ages is a hash
available on G: ..
("ian" => 11)
associate key ian with value 11
like parallel arrays .. but you can have as many as you want that are parallel and here you're associating a key with the value
$ages{"ian"}
.. uses the heap
just like when I was doing things with arrays, I can create extra keys and values
so the next time that it printed, prints the added keys
does not store in order
values() function returns all the values in the hash
sort based on keys
reverse sort based on keys.
when I sort keys, I can sort based on the actual string .. can sort on the hash value or i can do it on the actual strings or keys
sort numerically on keys
and then by values
can use #s for keys and strings for values, of course
while loop assigns all key value pairs .. ends when you can't assign
ask "if exists" to search
hash .. not trying to process everything in the hash.
might as well use an array to loop through everything
hash functions get to the desired thing in 1 step, if it's a good function
what happens in my implementation
keys are stored by whatever number they hash to and somewhere else sit the values
run out of room in the table, need to increase it .. generally by a power of two
.. when I increase the size of the hash table, only half of the elements will have to move when the hash is expanded
dynamic .. heap -- space is allocated and deallocated as necessary .. well we don't know if perl gives it back
C++ doesn't give it back .. just keeps it until you need it again.
$a <=> $b increasing
$b <=> $a decreasing
built in hashes that might be of use
built in hash called %ENV .. environment variables
talking about what's around where this program is running.
don't assign hashes from sorting.
.. doesn't work as expected.
load a file into a hash
curly braces .. alert the thing that it's a hash ..not array
build a hash on the fly
faster than searching in an array -- search by key
not meant for lists but are meant for searching
current size relates to number of digits of hashed values to use
kinda like parallel arrays of only two
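A sketch of the same ideas through the C++ class library, using std::unordered_map as the hash and std::map to visit keys in sorted order. The key "tess" and its value are invented data; only "ian" => 11 comes from the notes. Ages, make_ages, has_key and sorted_keys are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

using Ages = std::unordered_map<std::string, int>;  // hashed, not stored in order

Ages make_ages() {
    Ages ages = {{"ian", 11}};
    ages["tess"] = 9;       // assigning to a new key creates the pair (made-up data)
    return ages;
}

// Like perl's "exists": look the key up without creating it.
bool has_key(const Ages& ages, const std::string& key) {
    return ages.find(key) != ages.end();
}

// Like "sort keys": copy into an ordered map and walk it in key order.
std::vector<std::string> sorted_keys(const Ages& ages) {
    std::map<std::string, int> ordered(ages.begin(), ages.end());
    std::vector<std::string> keys;
    for (const auto& kv : ordered) keys.push_back(kv.first);
    return keys;
}
```

has_key uses find rather than ages[key] because the subscript operator would quietly insert a missing key.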
6.7 record types
look at cobol stuff
heterogeneous aggregate of data elements in which elements are identified by name
C/C++ : structs
struct kidtype {
string name;
int age;
};
kidtype k;
k.name = "ian";
...
identifying the particular data element with its name
a descriptor for these guys is not a fixed size
.. based on the number of fields I have
has the idea that I am a struct and it has to have the idea of a struct of what
has
name
type
offset
.. for each field
so for each field in the struct I must know name, datatype .. offset helps me to determine where the next guy is.
all kinds of things that go into organizing the data .. compilers might try to fit it better than the order you have
so regardless of whether I'm in the stack or the heap I need to know name, type and offset because memory is being organized
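A sketch of the name / type / offset information for a record, using offsetof. KidType here uses a char array for the name (instead of string) so the struct stays a plain C-style layout; FieldInfo and kid_fields are a hypothetical descriptor, not what any real compiler stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct KidType {
    char name[16];
    int  age;
};

// What a descriptor entry for one field might record.
struct FieldInfo {
    const char* name;
    const char* type;
    std::size_t offset;
};

const FieldInfo kid_fields[] = {
    {"name", "char[16]", offsetof(KidType, name)},
    {"age",  "int",      offsetof(KidType, age)},
};

// Field access = address of the record + offset of the field.
int read_age(const KidType& k) {
    const char* base = reinterpret_cast<const char*>(&k);
    int age;
    std::memcpy(&age, base + kid_fields[1].offset, sizeof(age));
    return age;
}
```

The offset of age is at least 16 (the compiler may add padding), which is the "where the next guy is" part.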
cobol calls them records
pascal also calls them records
C/C++ - structures
C++/Java -- classes ... data and functions, usually. structures are usually just data.
C structs go on the stack (as locals, unless you malloc them)
C++ class objects go on the stack (unless you use new)
Java classes go on the heap
C# tries to be like java so those also go on the heap
pascal and cobol were certainly stack
where these things are stored is also different based on language
design issues
1. how to reference the fields
2. what about elliptical (ie, circular)
we know that in C, C++, we do the object name dot the member name .. we use the dot operator for reference.
struct kidtype {
int a;
char b;
kidtype c;
};
^ ^ elliptical
trying to define kidtype with kidtype
very difficult to think of what this structure would look like
but we can do pointers .. that's how we get linked lists.
some languages implement this -- she doesn't know how.
.. if everything is on the heap, not trying to allocate space .. can probably get away with it.
last datatype: one that gives people particular grief .. certain languages have tried to eliminate that grief
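A sketch of the pointer version that does compile. Node and count_nodes are illustrative names; the commented-out member is the circular definition from the example.

```cpp
#include <cassert>

struct Node {
    int   a;
    char  b;
    Node* next;   // a pointer to Node has a known size, so this compiles
    // Node c;    // error: Node would have to contain itself
};

// Walk the links until we hit the null pointer.
int count_nodes(const Node* head) {
    int n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) ++n;
    return n;
}
```

The struct can hold a pointer to its own type because the compiler only needs the size of the pointer, not the size of the thing pointed to.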
skipping 6.8 entirely
6.9 - pointer types and reference types
pointer variables
- store addresses
- range {all possible addresses} U {some null}
two purposes for pointer variables
1. indirect addressing p->name or (*p).name
2. manage dynamic memory
w/o pointers in C and C++, I can't do a linked list unless it has a fixed size .. can never get past the memory that I asked for.
pointers allow me to request dynamic memory and return it as well
operations on pointers:
----------------------
int * p;
can I say p = 10;? nope .. can only assign 0
C++ won't allow this, C will
p = p+2; .. can do that .. adds 8 to p (with 4-byte ints).
C/C++ do scaled arithmetic operations on pointers.
so allows some arithmetic
p = p+2 makes sense on an array only
p==q .. are we pointing to the same thing?
p > q .. meaningless unless using the same array
can't : cin >> p;
sometimes doesn't like:
cout << p;
if you give cout a pointer to a char, it prints the string.
new (malloc in C)
delete (free in C) to give them up
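A sketch of scaled pointer arithmetic: adding n to an int pointer moves the address by n * sizeof(int) bytes. bytes_moved is a made-up helper, and the 8-element array is only there so the pointer stays inside one array.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes the address moves when n is added to an int pointer.
std::ptrdiff_t bytes_moved(std::ptrdiff_t n) {
    assert(n >= 0 && n <= 8);   // stay inside the array (or one past the end)
    int arr[8] = {};
    int* p = arr;
    int* q = p + n;             // the compiler scales n by sizeof(int)
    return reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p);
}
```

On a machine with 4-byte ints, bytes_moved(2) is the "add 8 to p" from the notes; the code itself does not assume a particular int size.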
reference type:
in C++
void f(int & x);
int main() {
int a= 17;
f(a);
cout << a;
}
void f(int & x) {
x++;
}
you think that it's a pointer, but it's a reference type .. referring to this memory location as if it were there
not a pointer .. don't dereference
not a pointer, it's a reference .. x does not hold the address of anything.
reference object .. in Java and C# .. created on the heap and must use new .. but not a pointer .. never have an address
reference types are different from pointers.
-------
Wed 10/22/08
pointers and reference variables
Pointer types vs reference types
pointer type
- one in which the variables have a range of values consisting of addresses and NULL / 0
fundamental operations - most languages that have pointer types allow assignment and dereferencing
others: some allow pointer arithmetic, printing of actual addresses
reference type
similar to a pointer type BUT
whereas a pointer refers to an address - contains an address, a reference simply refers to an object or value in memory
so object in memory vs location->for pointer
the big difference is NO arithmetic .. it doesnt make sense .. not talking about an actual address
- it probably does contain an address, but it doesn't treat it as such
pointer variables:
- two problems
int * p;
int * q;
p = new int;
q = p;
*p = 17; // dereferencing a pointer
delete p;
q is what is now called a dangling reference -- it doesn't know that the memory was deleted
.. so this is a problem of any language that allows pointer variables
typically explicit allocation and deallocation
explicitly said give me more memory and give it back - "new" and "delete"
in C - malloc and free
so this idea of dangling references is a problem that programmers run into with pointers
problem #2:
int * p = new int;
*p = 17;
p = new int;
*p = 85;
this is the other problem .. a memory location for the pointer .. stores the address of a location in memory where I put the 17
and then I say I want another one .. maybe at 212 .. so no longer pointing to original memory
so the 17 is called garbage .. also .. lost heap dynamic variable
something that I own that I can't even find
solutions to the dangling pointer problem
1. tombstones .. tell you that something is there .. a marker that exists per pointer variable (so each pointer variable has one) .. it exists as long as the pointer variable exists ..
int * p; also going to say that I have a tombstone for p
says I have memory or I don't have memory .. another level of indirection
so when I want to dereference p - I have to make sure that I indeed have space -- that this is a valid pointer
extra thing that says if it contains a valid address or not
so something is managing the tombstones
and sets the tombstones
that is not language dependent
another method .. keys and locks.
2. keys and locks
- pointer variable has key and address
when I allocate the variable on the heap .. gets two chunks of memory -- gets address and a key
I can't allocate memory dynamically w/o an associated pointer .. both get the key
int *p;
p = new int;
so every time I try to access *p, checks the key value of p against the key value at the address stored in p .. and if they agree, life is good.
int *q = p;
now q also holds 777 and points to that memory .. so both pointers have the same key as the dynamically allocated memory has
but now if I say delete p;
then the key is not 777 anymore .. so when I delete p, that guy is no longer anything, q still holds 777 .. if I try to use q, it compares 777 vs -1 and says I can't use that.
so does not allow you to follow a pointer to something that doesn't exist anymore
so if when I dump I change the key, then I know it isn't good
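A toy model of keys and locks, following the 777 / -1 example. Everything here (HeapCell, CheckedPtr, the function names) is invented for illustration; the "heap cell" is ordinary caller-owned storage so it can still be inspected after the pretend delete.

```cpp
#include <cassert>

struct HeapCell {
    int lock;    // set when allocated, changed when freed
    int value;
};

struct CheckedPtr {
    int       key;       // copy of the lock taken at allocation time
    HeapCell* address;
};

CheckedPtr allocate(HeapCell& cell, int lock) {
    cell.lock = lock;
    return CheckedPtr{lock, &cell};
}

void deallocate(CheckedPtr& p) {
    p.address->lock = -1;   // the cell no longer matches any outstanding key
    p.key = 0;
}

// Every dereference pays for one comparison.
bool can_dereference(const CheckedPtr& p) {
    return p.address != nullptr && p.key == p.address->lock;
}
```

After deallocate(p), a copy q made earlier still carries key 777, the cell's lock is -1, and the check fails, so the dangling use is caught.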
so two ways to deal with this
our second way costs extra space -- in a shared memory environment that's big .. also costs me a comparison
third mechanism:
do not allow programmers to explicitly deallocate
.. so there is no delete .. you never get to give it back .. you cannot create a dangling pointer on your own
so rather than catch you if you did, they just won't let you delete anything .. that's C#, Java
but if you don't allow, what happens?
so how do I know? .. that's OS .. heap management - the OS does this anyway! as your program runs .. like stack space and heap space and when the program dies, the OS figures out what you haven't given back
garbage collection in java and c#:
- two ways:
- eager method - whenever something becomes inaccessible, mark it as such
- lazy method - whenever i'm out of space, then check - don't throw anything out until everything's full
eager: when I'm done with it, i'll give it back
when I allocate dynamic memory, I also allocate a counter = 1 .. somebody is looking at me .. somebody's using me
int * p = new int;
the space has been allocated 1 time
if I say q = p;
q is also looking at that guy, counter up to 2
p = new int;
.. now counter goes back to 1
so p and q have to know where the counter is but it's part of the thing that it's referencing
when the counter is down to zero, memory is returned.
eager spends a lot of time and spends a little space .. for counter.
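The counter idea can be watched in C++ with std::shared_ptr, which keeps a reference count alongside the object. This is a sketch of the counting only, not of how Java or C# collect garbage; Counts, trace_counts and freed_when_count_hits_zero are made-up names.

```cpp
#include <cassert>
#include <memory>

// The counter after each step of: p = new int; q = p; p = new int;
struct Counts { long after_alloc; long after_copy; long after_reassign; };

Counts trace_counts() {
    Counts c{};
    std::shared_ptr<int> p = std::make_shared<int>(17);
    c.after_alloc = p.use_count();      // one owner
    std::shared_ptr<int> q = p;
    c.after_copy = q.use_count();       // two owners
    p = std::make_shared<int>(85);      // p lets go of the first int
    c.after_reassign = q.use_count();   // back to one owner
    return c;
}

// When the last owner goes away, the object is destroyed.
bool freed_when_count_hits_zero() {
    std::weak_ptr<int> watcher;         // observes without owning
    {
        std::shared_ptr<int> p = std::make_shared<int>(17);
        watcher = p;
    }
    return watcher.expired();
}
```

The cost matches the notes: a little extra space for the counter and a little extra time on every copy and reassignment.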
so why can't people learn to program with pointers -- that's what C does .. assumes you'll get it right.
different languages and how they use reference types.
reference type is truly the idea of let's don't allow programmers to explicitly deallocate
refers to an object or value .. technically a pointer, but it's not
in C++, a reference type is a constant pointer
int &ref_result = result;
see handout
reference to a char is technically a char .. not a reference to a char.
reference type refers to the object
reference types I don't explicitly dereference
automatically gets dereferenced
can think of them as an alias
passing parameters - c++ allows reference parameters, c never did
x refers to the object
implicitly dereferenced!
c++ attempted to give you a reference type .. very useful for functions
Java
- - -
javac first.java
java Cat // takes name off of first class
see handout
new allocates space
kitty will refer (reference type) to that object
objects of classes in Java are all reference variables.
no pointers involved -- just reference variables
every non-primitive type in Java is a reference type.
doesn't work that way for ints
x and y probably allocated on the stack.
C#
--
primitives .. int
objects everything else.
references are indeed references -- when I assign objects, they're actually referencing the same memory location
ints would turn out the same way as in Java
stmType.pl
perl allows two kinds of referencing
symbolic reference -
can use the contents of a variable to refer to another variable!
last example: refTypes.pl
pointers -- useful for hashes of hashes, etc.
-------
Fri 10/24/08
exam - 5, 6 and perl
user -> browser (HTML) -> server - http
to browser
if all I want is text back, then there isn't really anything to do
.. but we send forms -- credit card info .. airline confirmation
so most of that is handled through a CGI gateway
common gateway interface
.. we'll use perl
CGI program -- for us in perl
.. it creates the info that comes back to the server to send back to me
the server .. compsci .. is apache in our case
browser shouldn't matter
server sends input .. sends output back to computer back to me
trying to make sure that what you write in the program we can see in the browser
mcvebm/csci322 files for this course
grab these files and put them in public_html .. make sure it works
Content-type: text/plain
unique_id .. different every refresh
first.pl
outputs the date
print <<EOF; # no whitespace on this command!!
stuff
stuff
stuff
EOF
^^ not the Rohm style
simpleform.html
split("&",$qstr);
$pair =~ tr/+/ /;
.. + to blank
($key, $value) = split("=", $pair); .. split on =
"pack" line
s - substitute
expressions that look like %(..) --> each dot is a single character
$1 .. hex value from %(..)
"c" .. don't know what that's for
g = global .. as often as it occurs
e - evaluate .. goes with the pack function
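The same decoding steps (split on &, + to blank, split on =, %XX to a character) sketched in C++ to make the logic explicit. url_decode and parse_query are hypothetical names, and skipping pairs with no = is an assumption of this sketch, not something the perl script is known to do.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Turn '+' into a blank and %XX into the character with that hex code.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;   // skip the two hex digits we just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}

// Split "k1=v1&k2=v2" on '&', then each pair on the first '='.
std::map<std::string, std::string> parse_query(const std::string& qstr) {
    std::map<std::string, std::string> form;
    std::istringstream pairs(qstr);
    std::string pair;
    while (std::getline(pairs, pair, '&')) {
        std::size_t eq = pair.find('=');
        if (eq == std::string::npos) continue;   // skip malformed pairs
        form[url_decode(pair.substr(0, eq))] = url_decode(pair.substr(eq + 1));
    }
    return form;
}
```

The result is a key-to-value map, the same shape as the perl hash built from the form data.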
-------
Mon 10/27/08
Exam 2 - Nov 3
chapter 8 today:
control structures
chap 9 & 10 - subprograms
chp 12 - oop
chp 15 - functional programming (scheme)
chap 16 - logic programming (prolog, lisp, snobol)
Chapter 8 - control structures:
- imperative programming languages
- assignment statemnts and variables
- control structures
- selection
- repetition
back in 60's and 70's .. came out of machine language initially
idea of repetition and selection came out of machine language, which meant that the initial control statements allowed some bad things like goto and labels
1966 - paper that proves that all you need is a two-way selection (ie, if...else)
and a logic-controlled loop (ie, while)
if you have these two statements and assignments and such .. you can do anything
but in C++
if
if else
switch
ternary operator
for
while
do-while
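A small sketch of the "two statements are enough" claim: a for loop and a do-while rewritten using only while. The function names are invented; the extra statements are there for readability, not power.

```cpp
#include <cassert>

// Counting loop written with for.
int sum_with_for(int n) {
    int s = 0;
    for (int i = 1; i <= n; ++i) s += i;
    return s;
}

// The same loop using only while: init before, update at the end of the body.
int sum_with_while(int n) {
    int s = 0;
    int i = 1;
    while (i <= n) {
        s += i;
        i = i + 1;
    }
    return s;
}

// do-while: the body runs once before the test.
int digits_with_do_while(int n) {
    int d = 0;
    do { ++d; n /= 10; } while (n != 0);
    return d;
}

// Same behavior with only while: a flag forces the first pass.
int digits_with_while(int n) {
    int d = 0;
    bool first = true;
    while (first || n != 0) {
        first = false;
        ++d;
        n /= 10;
    }
    return d;
}
```

The flag in digits_with_while is the kind of contrived code that shows up when the right control structure is missing.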
perl:
-----
if
if else
if elsif else
unless
unless else
unless elsif else
for
do while
do until
while
foreach
so perl has 6 conditional statements and at least 5 loops
so why do we have more than two
readability - if there are too few control structures, coding can get contrived/hacked
if you don't have the right kind of statement - if I only have counting loops and I need to do something until I've read a particular character .. that becomes tricky .. usually ends up not readable
too many .. well programmers typically learn only a subset
perl also has conditional modifiers .. print ___ if defined
control structure
- control statement and the collection of statements whose execution it controls
going to go through an if statement slowly and painfully
Design issues about control structures in general.
1. multiple entries? - should I have more than one way to enter a loop or body of if
2. multiple exits? - more than one way to leave loop/if
example:
if (x>5)
goto cow;
x =17;
while (x >= 0) {
...
..
cow:
...
}
so I can get into the control structure by more than just the first statement .. generally a very bad thing
not a good idea .. readability goes down .. how do we now talk about what the while statement actually does.
what's the precondition with two entry methods .. makes it hard to prove that a section of code does what I say it will do
- only possible with current control structures if the language has a goto
.. so no multiple entries without goto's
multiple exits, typically we don't worry about
- not considered an issue as long as the flow of control goes outside the structure
while (x > 0) {
..
..
..
break;
}
next statement
we know where we are when we get out no matter if we terminate in the usual way or jump out unexpectedly
so, OK as long as flow of control goes to the statement following normal termination of the loop
.. it's still readable .. the intent is still clear
similar issues with subprograms
- should I be able to get into the body of a subprogram from more than one place?
we know absolutely not .. should not be jumping into a function .. think about stack frame issue
a lot of argument about multiple exits
when I exit where do I return to? the calling function .. which could be anywhere in the program
so there are people who scream about using multiple returns
8.2 Selection statements
if controlexpression
then clause
else clause
.. two-way selection statement
eval control expression
- true -> then clause
- false -> else clause
no matter what, I do the next statement
all languages execute them in the same way
the difference is in the syntax
if (expr)
statement;
else
statement;
if (expr) {
stat