[imc-cms] CMS + Techmeet + Malandro

ryan ryan at linefeed.org
Sat Jul 26 19:03:19 PDT 2008


Earlier today, I made some notes about Malandro and I put them next to 
your README in a file called README.ryan ... it talks about some 
differences in the way that I envisioned the ICE layer and what was in 
the README... not big differences... but, I might as well put it here, 
too. (I tarballed it and attached it to the Malandro page on 
dev.bunke.indymedia.org)

Here is what I see the ICE layer doing:

* monitoring (checking the health of the other layers),

* services for the front-end (get this from db/cache, put this in 
db/cache, etc),

* object store (it caches to memory & disk; objects can be sessions, 
articles, users, media),

* a mechanism for propagating media to media-serving "clusters" (hosted 
on the web tier boxes, or maybe web tier boxes that are specifically for 
static content).

an example would be:

1) there is media01.indymedia.org, media02.indymedia.org, etc.
2) someone from houston uploads bush.JPG ... the web server they used 
pushes bush.JPG to its own local media directory (maybe it is 
media04.indymedia.org, so immediately the file is available at 
http://media04.indymedia.org/houston/bush.JPG).
3) in the background, the webserver sends bush.JPG to the ICE layer to 
propagate.
4) the ICE layer first starts propagating the media file within the
"cluster" that the originating webserver belonged to, let's say 
mediacluster02 (media04 - media08). so ICE sends bush.JPG to 
media05.indymedia.org, media06.indymedia.org, media07.indymedia.org &
media08.indymedia.org. so, now this media object is "available" on any 
webserver in mediacluster02 and ICE will give a page cache to each 
webserver that uses 1 of the 5 media0x.indymedia.org webservers,
distributing the load. the reason for having "clusters" is so that we 
don't have to wait for the file to propagate alllll over the network 
before giving out the cache files from ICE to the webservers.
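
A rough sketch of steps 2-4 above, in Python. The cluster table, the function names and the in-memory `stores` dict (standing in for real file transfer) are all made up for illustration; none of this is Malandro or ICE code.

```python
# Hypothetical cluster membership, taken from the example above.
CLUSTERS = {
    "mediacluster02": ["media04", "media05", "media06", "media07", "media08"],
}

def cluster_of(host):
    """Return the name of the cluster that a media host belongs to."""
    for name, hosts in CLUSTERS.items():
        if host in hosts:
            return name
    raise KeyError(host)

def propagate_within_cluster(origin, path, stores):
    """Copy `path` from `origin` to the other hosts in origin's cluster.

    `stores` maps host -> set of paths it holds (a stand-in for the real
    media directories). Returns the hosts that can now serve the file.
    """
    hosts = CLUSTERS[cluster_of(origin)]
    for host in hosts:
        if host != origin:
            stores.setdefault(host, set()).add(path)
    return [h for h in hosts if path in stores.get(h, set())]

def media_urls(path, hosts):
    """Build the public URL of `path` on each host."""
    return ["http://%s.indymedia.org/%s" % (h, path) for h in hosts]

# Step 2: the upload lands on media04 first.
stores = {"media04": {"houston/bush.JPG"}}
# Steps 3-4: propagate inside mediacluster02 only.
ready = propagate_within_cluster("media04", "houston/bush.JPG", stores)
urls = media_urls("houston/bush.JPG", ready)
```

Once `ready` covers the whole cluster, any of the five URLs can go into a page cache, without waiting for the rest of the network.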

* ICE can be a smarter replacement for RR DNS for the database tier,

* other stuff like this. i don't think that the producer lives on the ICE 
tier or that article/page html is generated on the ICE tier. that's why 
we want a widely-used front end that can be extended with CakePHP (which 
gives us rapid web application development + a large library of reusable 
code from its own big developer community)

if we write all of our article generation code in python, not only do we 
lose all of those advantages but we're also writing from scratch a big 
piece of what the CMS does (writing from scratch being something we want 
to avoid as much as possible).

i agree that we have to write scripts that support ICE but they should 
be scripts that support ICE in its role as a dispatcher, monitor, 
database pool (sort of), object store and a service provider for the web 
tier.
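
To make the "monitor" and "database pool (sort of)" roles a bit more concrete, here is a minimal sketch of a pool that skips hosts whose health check failed, which plain RR DNS can't do. The class, its methods and the db01-db03 host names are assumptions for illustration, not an existing ICE or IceGrid API.

```python
import itertools

class DatabasePool:
    """Round-robin over database hosts, skipping any marked unhealthy."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.healthy = {h: True for h in self.hosts}
        self._cycle = itertools.cycle(self.hosts)

    def report(self, host, ok):
        """Record the result of a health check (the monitoring role)."""
        self.healthy[host] = ok

    def pick(self):
        """Return the next healthy host, trying each host at most once."""
        for _ in range(len(self.hosts)):
            host = next(self._cycle)
            if self.healthy[host]:
                return host
        raise RuntimeError("no healthy database hosts")

pool = DatabasePool(["db01", "db02", "db03"])  # hypothetical host names
pool.report("db02", False)                     # a health check failed
picks = [pool.pick() for _ in range(4)]
```

With db02 marked down, `picks` only ever alternates between db01 and db03.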

i see the web tier as being: a) boxes that host a bunch of virtual hosts 
serving out stuff from a light-weight and extensible web application and 
b) serving static content (and maybe both, if the box/bandwidth for that 
particular machine is strong enough).

> a.) rsync replacement

Why not rsync for everything?

> c.) Frontend language
> If you run a single-server setup where you would not need ice, you  
> run into the problem that you have to stick with one language. It  
> would not be possible to run PHP as Frontend and Python as backend.

I think the single-server setup is something we think about after we've 
got -this- figured out (this graphic isn't entirely right; I need to 
make a replacement because there should be many servers on each tier, 
but that's expressed in the text on that page):
http://dev.bunke.indymedia.org/attachment/wiki/TechOverview/indyarch.jpg


> Well, ICE works by making a abstraction of your Objects and  
> Functions, and it is hard to share the same objects and functions  
> between PHP and Python. (ICE makes a Serialization of the Functions  
> and Objects).

Sharing functions between the front & back end is interesting but I 
don't know if we need to do that right away... but, we do need to share 
objects. But I don't think we're limited to a single language...

Slice works with IcePHP with simple built-in types (boolean, integer, 
string, etc), enumerations (although I don't think we'll use those), 
classes (structures), indexed arrays (sequences), associative arrays 
(dictionaries), constants, exceptions, etc. I think we won't need Slice 
types outside of this list, right?
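
As a sanity check on "we won't need types outside of this list", here is a small Python sketch of an article object that sticks to simple types, sequences and dictionaries, plus a helper that verifies it. It does not use Ice or Slice at all; the `Article` fields, the `is_portable` helper and the string-keys-only rule are assumptions made up for this example.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Article:
    """Hypothetical article object limited to the kinds of types listed above."""
    id: int
    title: str
    published: bool
    tags: list = field(default_factory=list)   # sequence
    meta: dict = field(default_factory=dict)   # dictionary

SIMPLE = (bool, int, float, str)

def is_portable(value):
    """True if value is built only from simple types, lists and string-keyed dicts."""
    if isinstance(value, SIMPLE):
        return True
    if isinstance(value, list):
        return all(is_portable(v) for v in value)
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_portable(v) for k, v in value.items())
    return False

article = Article(1, "bush in houston", True, ["houston"], {"media": "houston/bush.JPG"})
ok = is_portable(asdict(article))         # plain data only
bad = is_portable({"when": object()})     # an arbitrary object is rejected
```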

If we -really- have to (I don't think we do), we can just take our PHP 
objects and make them Python objects before sending them along to the 
ICE layer - http://www.ohloh.net/projects/10100


> d.) Alternative to ICE

> In ICE you make
> 
> article = new IceArticle()	// this is nothing more then a empty class  
> that holds your article
> article.title = $POST['title']
> ice.server.publish(article)	// ice provides you with the  
> serialization of the data and the functions

I think you have ICE doing waaay too much stuff here, this isn't what we 
need ICE for. We need ICE to:

1- Make it easy to send objects from our PHP front-end to our Python 
middleware (which puts them in the db by knowing where our 
"insert/update db's" are) or request stuff from our Python middleware 
(which knows where the "read-only" db's are or knows if we have it 
available in a local cache), etc.

2- It makes it -really- easy to build a distributed network of front and 
back end servers with all the cool functionality that comes with 
IceGrid. We probably won't start out using even 1/3 of the great stuff 
we can do with IceGrid but eventually, we -can- use it (IceGrid even has 
a cool Java GUI to administer the grid!!).

3- It boosts the power of PHP by basically giving us a tight, 
distributed app server.

4- In general, ICE is providing us with a robust, well-designed, 
efficient protocol for making a huge distributed network on the internet.
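
Point 1 above (writes go to the "insert/update" db, reads come from a local cache or a "read-only" db) could look roughly like this on the Python side. The `Middleware` class and its methods are hypothetical, plain dicts stand in for the databases, and replication between the write db and the read-only dbs is not modeled.

```python
class Middleware:
    """Hypothetical stand-in for the Python middle tier."""

    def __init__(self, write_db, read_dbs):
        self.write_db = write_db   # the "insert/update" db
        self.read_dbs = read_dbs   # the "read-only" dbs
        self.cache = {}            # local object cache
        self._next = 0             # index of the next read-only db to use

    def publish(self, key, obj):
        """Writes always go to the insert/update db; drop any stale cached copy."""
        self.write_db[key] = obj
        self.cache.pop(key, None)

    def fetch(self, key):
        """Serve from the local cache if possible, else from a read-only db.

        Returns (object, source). Raises KeyError if the chosen db lacks the key.
        """
        if key in self.cache:
            return self.cache[key], "cache"
        db = self.read_dbs[self._next % len(self.read_dbs)]
        self._next += 1
        obj = db[key]
        self.cache[key] = obj
        return obj, "db"

replica = {"article:1": {"title": "hello"}}
master = {}
mw = Middleware(write_db=master, read_dbs=[replica])
first = mw.fetch("article:1")    # comes from the read-only db
second = mw.fetch("article:1")   # now served from the cache
mw.publish("article:2", {"title": "new"})
```

The front-end never needs to know which db is which; it only calls `publish` and `fetch`.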

For me, that's the majority of what ICE is going to do for us. The main 
"network functions" I see ICE providing us with are storing/retrieving 
stuff from the db (sessions/articles/etc), providing a cache of the 
same, etc. Maybe I'll make a visual representation of what I'm trying to 
express on this page:
http://dev.bunke.indymedia.org/wiki/ArchitectDiscussion

I think ICE beats up everything else we could use.. :)

> If we stick with the simplified "3 Tier Model" of Frontend, Publisher  
> and Backend
> 
> - What Functions and Objects will each Tier Provide?

TIER #1: FRONT-END

front-end (Type #1 - Application) = CMS software, publishes & renders 
("produces") pages that contain uploaded media, generates any pages that 
absolutely can't be cached.

front-end (Type #2 - Static) = runs a webserver that is faster than 
Apache, serves up static media that users have uploaded

--------------------------

TIER #2: MIDDLE TIER

ICE server = maintains network grid, provides services to Tier #1, acts 
as a middle-man between Tier #1 & Tier #3, load balancer, object store, 
etc. (from above)

--------------------------

TIER #3: DATABASE TIER

For this, I think we ask billf, who has seen this deployed in a couple of 
different ways and in a bunch of mega-traffic situations. We could do 
the standard replication model but we could also attempt to do 
geographically-distributed clustering ... no idea how well this is 
working in real life. I think we would start with the standard model 
that we know works -- and then attach a cluster to it and see how it 
performs. That way we get to try it out without risking anything.

> - How will the Interfaces between the Tiers look like?

Between Tier #1 & Tier #2, ICE
Between Tier #2 & Tier #3, standard MySQL networking
Between Tier #1 & Tier #3, -never happens-
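
The same three rules as a tiny lookup table, in case it helps to see them as data. The `INTERFACES` table and `interface` function are made-up names for illustration only.

```python
# Allowed links between tiers, taken from the list above.
INTERFACES = {
    frozenset({1, 2}): "ICE",
    frozenset({2, 3}): "MySQL",
}

def interface(a, b):
    """Return the protocol used between tiers a and b, or None if they never talk."""
    return INTERFACES.get(frozenset({a, b}))

links = [interface(1, 2), interface(2, 3), interface(1, 3)]
```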


